ABSTRACT


A continuous  state  space  model  for  the  problem  of  dynamic  routing  in 
data  communication  networks  has  been  recently  proposed.  In  this  paper 
we  present  an  algorithm  for  finding  the  feedback  solution  to  the  associated 
linear  optimal  control  problem  with  linear  state  and  control  variable 
inequality  constraints  when  the  inputs  are  assumed  to  be  constant  in  time. 
The  Constructive  Dynamic  Programming  Algorithm,  as  it  is  called,  employs 
a combination  of  necessary  conditions,  dynamic  programming  and  linear 
programming  to  construct  a set  of  convex  polyhedral  cones  which  cover 
the admissible state space with optimal controls. Due to several complicating
features which appear in the general case, the algorithm is
presented in a conceptual form which may serve as a framework for the
development of numerical schemes for special situations. In this vein,
the authors present in a forthcoming paper the case of single destination
network problems with all equal weightings in the cost functional.

I. INTRODUCTION

A data communication network is a facility which interconnects a
number of data devices, such as computers and terminals, by communication
channels for the purpose of transmission of data between them. Each
device can use the network to access some or all of the resources available
throughout the network. These resources consist primarily of
computational power, memory capacity, data bases and specialized hardware
and software. With the rapidly expanding role being played by data
processing in today's society it is clear that the sharing of costly
computer resources is an eventual, if not current, desirability. In
recognition of this fact, research in data communication networks began
in the early 1960's and has blossomed into a sizeable effort in the
1970's. A variety of data networks have been designed, constructed and
implemented with encouraging success.

We begin our discussion with a brief description of the basic components
of a data communication network and their respective functions.
For more detail, refer to [1]. Fundamentally, what is known as the
communication subnetwork consists of a collection of nodes which exchange
data with each other through a set of connective links. Each node
essentially consists of a minicomputer and associated devices which may
possess data storage capability and which serve the function of directing
data which passes through the node. The links are data transmission
channels of a given rate capacity. The data devices which utilize the
service of the communication subnetwork, known as users, insert data
into and receive data from the subnetwork through the nodes.

The  data  traveling  through  the  network  is  organized  into  messages, 
which  are  collections  of  bits  which  convey  some  information.  In  this 
paper  we  shall  be  concerned  with  the  class  of  networks  which  contain 
message storage capability at the nodes, known as store-and-forward
networks.  The  method  by  which  messages  are  sent  through  the  network 
from  node  of  origin  to  node  of  destination  is  according  to  the  technique 
known  as  message  switching,  in  which  only  one  link  at  a time  is  used 
for  the  transmission  of  a given  message.  Starting  at  the  source  node, 
the  message  is  stored  at  the  node  until  its  time  comes  to  be  transmitted 
on  an  outgoing  link  to  a neighboring  node.  Having  arrived  at  that  node 
it is once again stored in its entirety until being transmitted to the
next  node.  The  message  continues  in  this  fashion  to  traverse  links  and 
wait  at  nodes  until  it  finally  reaches  its  destination  node.  At  that 
point it leaves the communication subnetwork by being immediately transmitted
to the appropriate user.

Frequent  use  is  made  of  a special  type  of  message  switching  known  as 
packet  switching.  This  is  fundamentally  the  same  as  message  switching, 
except  that  a message  is  decomposed  into  smaller  pieces  of  maximum  length 
called  packets.  These  packets  are  properly  identified  and  work  their  way 
through  the  network  in  the  fashion  of  message  switching.  Once  all  of  the 
packets  belonging  to  a given  message  arrive  at  the  destination  node,  the 
message is reassembled and delivered to the appropriate user. Henceforth,
any mention of message or message switching will apply equally as
well to packets or packet switching.

The problem of routing messages through the network from their nodes
of origin to their nodes of destination is one of the fundamental issues
involved in the operation of networks. As such, it has received considerable
attention in the data communication network literature. It is
clear that the efficiency with which messages are sent to their destinations
determines to a great extent the desirability of networking data
devices. The subjective term "efficient" may be interpreted mathematically
in many ways, depending on the specific goals of the networks for
which the routing procedure is being designed. For example, one may wish
to minimize total message delay, maximize message throughput, etc.
In this paper we shall restrict attention to the minimum delay message
routing problem.

In order to arrive at a routing procedure for a data communication
network  one  must  begin  with  some  representation  of  the  system  in  the  form 
of  a mathematical  model.  As  is  always  the  case,  there  are  a number  of 
important  considerations  which  enter  into  the  choice  of  an  appropriate 
model.  Firstly,  one  wishes  the  model  to  resemble  the  nature  of  the  actual 
system  as  closely  as  possible  — for  instance,  if  the  system  is  dynamic 
the  model  should  be  capable  of  simulating  its  motions.  Secondly,  the 
model should describe the system's behavior directly at the level in which
one is interested, neither too specific nor too general. Finally, the
model  should  be  of  some  use  in  analyzing  or  controlling  the  ultimate 
behavior  of  the  system. 

These issues pose challenging problems in the formulation of models
which are to be used as a basis for the design of message routing procedures
for data communications networks. The basic problem is that there is no
natural model which describes the phenomenon of data flow in such a
network since the nature of this flow is largely dependent upon the character
of the routing procedure to be developed.

In  this  paper  we  do  not  confront  the  question  of  modelling  message 
flow but rather base our analysis on a model proposed by A. Segall in [2].
This  model,  which  is  a continuous  dynamical  state  space  description  of 
message  flow,  was  formulated  in  order  to  overcome  some  basic  deficiencies 
in previous models which are based upon queueing theory. The fundamental
advantages  of  this  model  with  respect  to  previous  models  are  discussed  in 
detail  in  [2]  and  are  presented  here  briefly.  Firstly,  the  model  may 
accommodate  completely  dynamic  strategies  (continual  changing  of  routes 
as  a function  of  time)  whereas  previous  techniques  have  been  addressed 
primarily  to  static  strategies  (fixed  routes  in  time)  and  quasi-static 
strategies (routes changing with intervals of time that are long compared
to  the  time  constants  of  the  system).  Next,  the  model  can  handle  closed- 
loop  strategies,  where  the  routes  are  a function  of  the  message  congestion 
in the network, in contrast to the open-loop strategies of static procedures,
in which the routes are functions only of the various parameters of
the  system.  Finally,  the  independence  assumption,  regarding  message 
statistics,  which  is  required  in  order  to  derive  routing  procedures  based 
upon  queueing  theory,  is  not  required  to  derive  procedures  based  upon  the 
[...] the dimension of the augmented problem may grow unreasonably large and
that  even  unconstrained  state  linear  optimal  control  problems  may  be 
difficult  to  solve  efficiently.  In  the  same  paper,  an  alternative 
technique  is  suggested  whereby  the  problem  is  formulated  as  a large 
linear  program  via  time  discretization  of  the  dynamics  and  the  constraints. 
However,  this  technique  also  encounters  the  problem  of  high  dimensionality 
when the time discretization is sufficiently fine to assure a good approximation
to the continuous problem. Besides, neither of the above techniques
provides explicitly for feedback solutions.

In [2] an approach is suggested, by way of a simple example, for
constructing  the  feedback  solution  to  the  linear  optimal  control  problem 
associated  with  message  routing  when  all  the  inputs  to  the  network  are 
assumed  to  be  zero.  The  purpose  of  this  paper  is  to  elaborate  upon  this 
approach  by  extending  it  to  the  general  class  of  network  problems  with 
inputs  which  are  constant  in  time.  An  algorithm  is  presented  for  the 
construction of the feedback solution which exploits the special structure
of the problem.

We  begin  by  presenting  in  Section  II  the  model  of  [2]  and  the 
associated  optimal  control  problem  for  closed-loop  minimum  delay  dynamic 
routing. 

The  necessary  conditions  of  optimality  for  general  deterministic 
inputs  are  developed  in  Section  III  and  shown  to  be  sufficient.  It  is 
immediately  seen  that  the  costate  variables  may  experience  jumps  when  the 
associated  state  variables  are  on  their  boundaries  and  that  the  costates 
are possibly nonunique. Also, the optimal control is of the bang-bang
variety and may also exhibit nonuniqueness. We subsequently restrict
consideration to the case in which the inputs are constant in time and
present a controllability condition for this situation. A special property
regarding the final value of the costates is also presented.

In Section IV we define special subsets of the state space known
as feedback control regions. Associated with each such region, in
principle, is a set of controls which are optimal for all the states
of the given region. Feedback control regions are shown to be convex
polyhedral cones, and the goal is to construct enough of these regions
to fill up the entire admissible state space. We demonstrate in
Section V how this may be achieved for two simple examples, and generalize
the notion in Section VI into the constructive dynamic programming
concept. The basic idea is to utilize a certain comprehensive set of
optimal trajectories fashioned backward in time from the necessary conditions
in order to construct the feedback control regions. An algorithm
is then presented, in conceptual form, for the realization of the
constructive dynamic programming concept. Several of the basic computational
techniques associated with the algorithm are presented in Appendices
A and B. Discussion and conclusions are found in Section VII.

Several complicating features of the algorithm render it too difficult
to compute numerically for general network problems. In [5] and
a forthcoming paper by the authors it is shown that for a class of problems
involving single destination networks these complicating features disappear.
For this case it is possible to formulate the algorithm in a form suitable
for numerical computation.

II. THE MODEL FOR DYNAMIC ROUTING IN DATA COMMUNICATION NETWORKS

We now describe the model presented in [2]. For a network of $N$ nodes,
let $\mathcal{N}$ denote the set of nodes and $\mathcal{L}$ the set of links. All links are
taken to be simplex, and $(i,k)$ denotes the link connecting node $i$ to
node $k$ with capacity $C_{ik}$ (in units of traffic/unit time). Attention
is restricted to the case in which all the inputs to the network are
deterministic functions of time. The message flow dynamics are given by:


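$$\dot{x}_i^j(t) = a_i^j(t) - \sum_{k \in E(i)} u_{ik}^j(t) + \sum_{\substack{l \in I(i) \\ l \neq j}} u_{li}^j(t), \qquad i,j \in \mathcal{N},\ j \neq i, \tag{1}$$

where

$x_i^j(t)$ = continuous state variable which approximates the amount of
data traffic (measured in messages, packets, bits, etc.)
at node $i$ at time $t$ whose destination is node $j$, $i \neq j$.

$a_i^j(t)$ = instantaneous rate of traffic input at node $i$ at time $t$
with destination $j$.

$u_{ik}^j(t)$ = control variable which represents that portion of $C_{ik}$
used at time $t$ for traffic with destination $j$.

$E(i)$ = set of nodes $k$ such that $(i,k) \in \mathcal{L}$.

$I(i)$ = set of nodes $l$ such that $(l,i) \in \mathcal{L}$.
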
We have the positivity constraints

$$x_i^j(t) \geq 0 \tag{2}$$

$$u_{ik}^j(t) \geq 0 \tag{3}$$

and the link capacity constraints are

$$\sum_{\substack{j \in \mathcal{N} \\ j \neq i}} u_{ik}^j(t) \leq C_{ik}, \qquad (i,k) \in \mathcal{L}. \tag{4}$$


The goal is to empty the network of its current message
storage in the presence of inputs in such a fashion as to minimize
the total delay experienced by all the messages traveling through
the network. Consider the cost functional

$$J = \int_{t_0}^{t_f} \Big[ \sum_{\substack{i,j \in \mathcal{N} \\ j \neq i}} \alpha_i^j x_i^j(t) \Big]\, dt \tag{5}$$

where $t_f$ is such that

$$x_i^j(t_f) = 0 \qquad \forall\, i,j \in \mathcal{N},\ j \neq i. \tag{6}$$

It is demonstrated in [2] that when $\alpha_i^j = 1$ $\forall\, i,j \in \mathcal{N}$, $j \neq i$, expression
(5) is exactly equal to the total delay. Priorities may be incorporated
by taking the weightings $\alpha_i^j$ to be unequal.
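
As a small illustration of the cost functional (5), consider a hypothetical network with a single queue of initial backlog `x0`, a constant input rate `a`, and one outgoing link of capacity `c` that is assumed to be used at full rate until the queue empties. None of these names come from [2]; they are chosen only for this sketch. The state is then the straight line x(t) = x0 - (c - a)t, so the integral in (5) is the area of a triangle.

```python
def total_delay_single_queue(x0, a, c, alpha=1.0):
    """Return (J, t_f) for one queue drained at full link rate c.

    Illustrative assumption: a single state variable with constant input a
    and a single link of capacity c used at full rate on [0, t_f].
    """
    if not 0 <= a < c:
        raise ValueError("need 0 <= a < c so that the queue can be emptied")
    drain_rate = c - a            # net rate at which the backlog decreases
    t_f = x0 / drain_rate         # time at which x(t_f) = 0, as in (6)
    # x(t) is linear, so the integral of x(t) over [0, t_f] is a triangle area.
    cost = alpha * 0.5 * x0 * t_f
    return cost, t_f
```

With `alpha = 1` the returned cost is the total delay of this one queue; a larger `alpha` would give the queue a higher priority in a weighted sum.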

For convenience we define the column vectors $x$, $u$, $a$, $C$ and $\alpha$ to
be consistently ordered concatenations of the state variables, control
variables, inputs, link capacities and weightings respectively. In
this paper we shall not be concerned with the particular ordering.

Denote $n = \dim(x) = \dim(a) = \dim(\alpha)$, $m = \dim(u)$ and $r = \dim(C)$.
Equations (1)-(6) may then be expressed in the vector form:

Dynamics

$$\dot{x}(t) = B\,u(t) + a(t) \tag{7}$$

Boundary Conditions

$$x(t_0) = x_0\,; \qquad x(t_f) = 0 \tag{8}$$

State Constraints

$$x(t) \geq 0 \qquad \forall t \in [t_0, t_f] \tag{9}$$

Control Constraints

$$u(t) \in U \triangleq \{\, u : D\,u \leq C,\ u \geq 0 \,\} \qquad \forall t \in [t_0, t_f] \tag{10}$$

Cost Functional

$$J = \int_{t_0}^{t_f} \alpha^T x(t)\, dt \tag{11}$$

In (7), $B$ is the $n \times m$ incidence matrix composed of 0's, +1's and
-1's associated with the flow equations (1), and $D$ is the $r \times m$ matrix
composed of 0's and 1's corresponding to (4). We now express the
linear optimal control problem with linear state and control variable
inequality constraints which represents the data communication network
closed-loop dynamic routing problem:

Optimal Control Problem

Find the set of controls $u$ as a function of time and state

$$u(t) \equiv u(t, x), \qquad t \in [t_0, t_f] \tag{12}$$

that brings any initial condition $x(t_0) = x_0$ to the final condition
$x(t_f) = 0$ and minimizes the cost functional (11) subject
to the dynamics (7) and the state and control variable inequality
constraints (9)-(10).
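
To make the structure of $B$ and $D$ concrete, the following sketch assembles them for a small network given as a list of nodes and directed links. The function name, the ordering of states and controls, and the list-of-lists matrix representation are assumptions made for illustration (the paper itself leaves the ordering open). Each control $u_{ik}^j$ contributes -1 to the row of $x_i^j$, +1 to the row of $x_k^j$ when $k \neq j$, and 1 to the capacity row of link $(i,k)$.

```python
def build_matrices(nodes, links):
    """Return (states, controls, B, D) for a list of nodes and (i, k) links.

    states   : pairs (i, j), one per state variable x_i^j with i != j
    controls : triples (i, k, j), one per control u_ik^j with j != i
    B        : n x m matrix of 0, +1, -1 entries (flow equations)
    D        : r x m matrix of 0, 1 entries (link capacity constraints)
    """
    states = [(i, j) for i in nodes for j in nodes if i != j]
    controls = [(i, k, j) for (i, k) in links for j in nodes if j != i]
    row = {s: r for r, s in enumerate(states)}
    B = [[0] * len(controls) for _ in states]
    D = [[0] * len(controls) for _ in links]
    for col, (i, k, j) in enumerate(controls):
        B[row[(i, j)]][col] = -1            # traffic for j leaves node i
        if k != j:
            B[row[(k, j)]][col] = 1         # and joins the queue at node k
        D[links.index((i, k))][col] = 1     # it consumes capacity of (i, k)
    return states, controls, B, D
```

A control whose next node is the destination itself (k equal to j) appears in only one state equation, which is why its column of `B` holds a single nonzero entry.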

Several assumptions have been made in order to facilitate the
modelling and solution. These are now discussed briefly.

(i) Continuous state variables. Strictly speaking, the state variables
are discrete, with quantization level being the unit of traffic
selected. The assumption is justified by recognizing that any single
message contributes little to the overall behavior of the network;
therefore, it is unnecessary to look individually at each of the
messages and its length.

(ii) Deterministic inputs. Computer networks almost always operate in
a stochastic user demand environment. It is suggested in [2] that the
deterministic approach may take stochastic inputs into account approximately
by utilizing the ensemble average rates of the inputs to generate
nominal trajectories. Also, valuable insight into the stochastic
situation may be gained by solving the more tractable deterministic
problem.

(iii) Centralized Controller. This is implied by the form of the
control law $u(t) \equiv u(t, x)$. This assumption may be valid in the case of
small networks. Also, obtaining the optimal strategy under this assumption
could prove extremely useful in determining the suboptimality of certain
decentralized schemes.

(iv) Infinite capacity buffers. Message buffers are of course of
finite capacity. This may be taken into account by imposing upper
bounds on the state variables, but this is not done in the current
analysis.

(v) All state variables go to zero at $t_f$. During normal network
operation the message backlogs at the nodes almost never go to zero.
Our assumption may correspond to the situation in which one
wishes to dispose of message backlogs for the purpose of temporarily
relieving congestion locally in time.

III. FEEDBACK SOLUTION FUNDAMENTALS

We begin by presenting the necessary conditions of optimality
for the general deterministic inputs case.

Theorem 1 (Necessary Conditions)

Let the scalar functional $h$ be defined as follows:

$$h(u(t), \lambda(t)) = \lambda^T(t)\,\dot{x}(t) = \lambda^T(t)\,[B\,u(t) + a(t)]. \tag{13}$$

A necessary condition for the control law $u^*(\cdot) \in U$ to be optimal
for problem (7)-(12) is that it minimize $h$ pointwise in time,
namely

$$\lambda^T(t)\,B\,u^*(t) \leq \lambda^T(t)\,B\,u(t) \qquad \forall u(t) \in U,\ \forall t \in [t_0, t_f]. \tag{14}$$

The costate $\lambda(t)$ is possibly a discontinuous function which satisfies
the following differential equation

$$-d\lambda(t) = \alpha\, dt + d\eta(t), \qquad t \in [t_0, t_f] \tag{15}$$

where, componentwise, $d\eta(t)$ satisfies the following complementary
slackness conditions:

$$x_i^j(t)\, d\eta_i^j(t) = 0 \tag{16}$$

$$d\eta_i^j(t) \leq 0 \tag{17}$$

$\forall t \in [t_0, t_f]$, $i,j \in \mathcal{N}$, $j \neq i$.

The terminal boundary condition for the costate differential
equation is

$$\lambda(t_f) = \nu \ \text{free} \tag{18}$$

and the transversality condition is

$$\nu^T \dot{x}(t_f) = 0. \tag{19}$$

Finally, the function $h$ is everywhere continuous, i.e.

$$h(u(t^-), \lambda(t^-)) = h(u(t^+), \lambda(t^+)) \qquad \forall t \in [t_0, t_f]. \tag{20}$$

Proof: In [6] a generalized Kuhn-Tucker theorem in a Banach space
for the minimization of a differentiable function subject to inequality
constraints is presented. For our problem, it calls for the formation
of the Lagrangian

$$\bar{J} = \int_{t_0}^{t_f} \alpha^T x(\tau)\, d\tau + \int_{t_0}^{t_f} \lambda^T(\tau)\,[B\,u(\tau) + a(\tau) - \dot{x}(\tau)]\, d\tau + \int_{t_0}^{t_f} d\eta^T(\tau)\, x(\tau) + \nu^T x(t_f) \tag{21}$$

where $\eta$ is an $n \times 1$ vector adjoining the state constraints which
satisfies the complementary slackness condition at optimality:

$$\int_{t_0}^{t_f} d\eta^T(\tau)\, x(\tau) = 0 \tag{22}$$

$$d\eta(\tau) \leq 0 \qquad \forall \tau \in [t_0, t_f]. \tag{23}$$

The vector $\nu$ which adjoins the final condition is an $n \times 1$ vector
of arbitrary constants.

For $u^*(\cdot)$ to be optimal, $\bar{J}$ must be minimized at $u^*(\cdot)$, where
$x(\cdot)$, $x(t_f)$ and $t_f$ are unconstrained and $u \in U$. Taking the
differential of $\bar{J}$ with respect to arbitrary variations of $x(\cdot)$,
$x(t_f)$ and $t_f$, we obtain

$$d\bar{J} = \int_{t_0}^{t_f} \alpha^T \delta x(\tau)\, d\tau + \alpha^T x(t_f)\, dt_f - \int_{t_0}^{t_f} \lambda^T(\tau)\, \delta\dot{x}(\tau)\, d\tau + \int_{t_0}^{t_f} d\eta^T(\tau)\, \delta x(\tau) + \nu^T dx(t_f). \tag{24}$$

If we integrate the term $\int_{t_0}^{t_f} \lambda^T(\tau)\,\dot{x}(\tau)\, d\tau$ by parts in equation
(21) and substitute equations (22) and (27)-(29) into (21) we obtain

$$\bar{J} = \int_{t_0}^{t_f} \lambda^T(\tau)\,[B\,u(\tau) + a(\tau)]\, d\tau. \tag{31}$$

In order for $\bar{J}$ to be minimized with respect to $u(\cdot) \in U$, the term
$\lambda^T(\tau)\,B\,u(\tau)$ must clearly be minimized pointwise in time, that is

$$\lambda^T(\tau)\,B\,u^*(\tau) \leq \lambda^T(\tau)\,B\,u(\tau) \qquad \forall u(\tau) \in U,\ \tau \in [t_0, t_f].$$

Thus, we have accounted for Equation (14), leaving only (20) to be proven.

To this end, let us assume that we have an optimal state trajectory
$x^*(t)$ and associated costate trajectory $\lambda(t)$, $t \in [t_0, t_f]$. Then by the
principle of optimality, for any fixed $\tau \in [t_0, t_f]$ the functions $x^*(t)$ and
$\lambda(t)$, $t \in [t_0, \tau]$, are optimal state and costate trajectories which carry
the state from $x_0$ to $x(\tau)$. Hence, all of our previous conditions
apply on $[t_0, \tau]$ with $x(t_f) = x(\tau)$. Applying the transversality condition
at $t_f = \tau$, we obtain

$$\lambda^T(\tau)\,\dot{x}(\tau) = -\alpha^T x(\tau). \tag{33}$$

Since Equation (33) holds for all $\tau \in [t_0, t_f]$ and $x(\tau)$ is everywhere
continuous, then $\lambda^T(\tau)\,\dot{x}(\tau)$ must be everywhere continuous. This proves
Equation (20).

We shall now describe the behavior of the costate variables as
functions of the corresponding state variables. We distinguish between
the case when $x_i^j > 0$ ($x_i^j$ is said to be on an interior arc) and when
$x_i^j = 0$ ($x_i^j$ is said to be on a boundary arc). When $x_i^j$ is on an
interior arc, Equation (16) implies $d\eta_i^j = 0$ and Equation (15) can
be differentiated with respect to time to obtain

$$-\dot{\lambda}_i^j(t) = \alpha_i^j. \tag{34}$$

When $x_i^j$ is on a boundary arc its costate is possibly discontinuous,
depending upon the nature of $\eta_i^j$. At points for which $\eta_i^j$ is
absolutely continuous we define $\mu_i^j(t) = d\eta_i^j(t)/dt$. Differentiating (15)
with respect to time and taking into account (16) and (17) we obtain:

$$-\dot{\lambda}_i^j(t) = \alpha_i^j + \mu_i^j(t), \qquad \mu_i^j(t) \leq 0. \tag{35}$$

On the other hand, at times when $\eta_i^j$ experiences a jump of magnitude
$\Delta\eta_i^j$ we have from Equations (15)-(17) that $\lambda_i^j$ experiences the jump

$$\Delta\lambda_i^j = -\Delta\eta_i^j \geq 0. \tag{36}$$

It is not difficult to see that the costate vector may be nonunique
for a given optimal trajectory; this is a fundamental characteristic
of the state constrained problem. Previous works such as [6]
have found this nonuniqueness to occur in costates corresponding to
state variables which are on boundary arcs. However, due to the fact
that in our case the pointwise minimization is a linear program, this
nonuniqueness may also be exhibited by costates corresponding to state
variables which are on interior arcs. This behavior is demonstrated in
Example 3-5 of [5], pages 186-193.

In general, any trajectory which satisfies a set of necessary
conditions is an extremal, and as such is merely a candidate for an
optimal trajectory. Fortunately, in our problem it turns out that any
such extremal trajectory is actually optimal, as is shown in the following
theorem.

Theorem 2. The necessary conditions of Theorem 1 are sufficient.

Proof. Let $x^*(t)$, $u^*(t)$, $\lambda(t)$ and $\eta(t)$ satisfy (7)-(10) and the
necessary conditions of Theorem 1. Also, let $x(t)$ and $u(t)$ be any
state and control trajectory satisfying (7)-(10). If we consider
$\delta J = J(x) - J(x^*)$ we have

$$\delta J = \int_{t_0}^{t_f} \alpha^T (x(\tau) - x^*(\tau))\, d\tau. \tag{37}$$

Substituting from (15) and expanding we obtain

$$\delta J = \int_{t_0}^{t_f} \big( -x^T(\tau)\, d\lambda(\tau) - x^T(\tau)\, d\eta(\tau) + x^{*T}(\tau)\, d\lambda(\tau) + x^{*T}(\tau)\, d\eta(\tau) \big). \tag{38}$$

From Equation (22)

$$\int_{t_0}^{t_f} x^{*T}(\tau)\, d\eta(\tau) = 0. \tag{39}$$

We now integrate the first and third terms of (38)
by parts, take into account $x(t_0) = x^*(t_0)$ and $x(t_f) = x^*(t_f) = 0$,
and finally substitute from (7) to obtain

$$\delta J = \int_{t_0}^{t_f} \lambda^T(\tau)\,B\,(u(\tau) - u^*(\tau))\, d\tau - \int_{t_0}^{t_f} x^T(\tau)\, d\eta(\tau). \tag{40}$$

But by (14)

$$\int_{t_0}^{t_f} \lambda^T(\tau)\,B\,(u(\tau) - u^*(\tau))\, d\tau \geq 0 \tag{41}$$

and since $x(\tau) \geq 0$ and $d\eta(\tau) \leq 0$ we have

$$\int_{t_0}^{t_f} x^T(\tau)\, d\eta(\tau) \leq 0. \tag{42}$$

Therefore, $\delta J \geq 0$ $\forall u(\cdot) \in U$, $x(\cdot) \geq 0$.


From inequality (14) of the necessary conditions we see that the
optimal control function $u^*(\cdot)$ is given at every time $\tau \in [t_0, t_f]$
by the solution to the following linear program with decision vector $u(\tau)$:

$$u^*(\tau) = \arg\min_{u(\tau) \in U}\ [\lambda^T(\tau)\,B\,u(\tau)]. \tag{43}$$

This is a fortuitous situation, since much is known about characterizing
and finding solutions of linear programs in general. We know,
for instance, that optimal solutions always lie on the boundary of the
convex polyhedral constraint region $U$. However, for the special forms
of the matrices $B$ and $D$ which correspond to our network problem
we may proceed immediately to represent explicitly the solution of the
pointwise (in time) linear program. The minimization can actually be
performed by considering one link at a time. Consider the link $(i,k)$
and a possible set of associated controls:

$$u_{ik}^1,\ u_{ik}^2,\ \ldots,\ u_{ik}^{i-1},\ u_{ik}^{i+1},\ \ldots,\ u_{ik}^N.$$

A given control variable may appear in one of the two following ways:

1) $u_{ik}^j$, $j \neq k$, enters into exactly two state equations:

$$\dot{x}_i^j(t) = -u_{ik}^j(t) + \ldots + a_i^j$$

$$\dot{x}_k^j(t) = +u_{ik}^j(t) + \ldots + a_k^j$$

2) $u_{ik}^k$ enters into exactly one state equation:

$$\dot{x}_i^k(t) = -u_{ik}^k(t) + \ldots + a_i^k \tag{45}$$

Hence, all controls on link $(i,k)$ contribute the following terms
to $\lambda^T B\,u$:

$$\sum_{\substack{j \in \mathcal{N} \\ j \neq i}} \big(\lambda_k^j(t) - \lambda_i^j(t)\big)\, u_{ik}^j(t) \tag{46}$$

where $\lambda_k^k(t) \equiv 0$.

The quantities which determine the optimal controls are the coefficients
of the form $(\lambda_k^j(t) - \lambda_i^j(t))$ which multiply the control $u_{ik}^j$.
The only situation in which it is optimal to have $u_{ik}^j$ strictly positive
is if $(\lambda_k^j(t) - \lambda_i^j(t)) \leq 0$. In terms of the network, this says that it
is optimal to send messages with destination $j$ from node $i$ to node $k$
at time $t$ only if the costate associated with $x_i^j$ at time $t$ is greater
than or equal to that associated with $x_k^j$ at time $t$. This suggests an
analogy between the frictionless flow of fluid in a network of pipes, in
which flow occurs from areas of higher pressure to areas of lower pressure,
and the optimal flow of messages in a data communication network, in which
flow occurs from nodes of "higher costate" to nodes of "lower costate".
By way of analogy to pressure difference we refer to $(\lambda_k^j(t) - \lambda_i^j(t))$ as
the costate difference which exists at time $t$ on link $(i,k)$ and is
associated with destination $j$. Therefore, it is optimal to send
messages of a given destination only in the direction of a negative (or zero)
costate difference.

If the costate difference on link $(i,k)$ associated with destination
$j$ is strictly negative and less than all the other costate differences on
this link, then the optimal control is $u_{ik}^j = C_{ik}$ and all other controls
are zero. However, when two or more costate differences on the same link
are non-positive and equal, the associated optimal control will not be
uniquely determined. In such a situation the optimal solution set is in
fact infinitely large. The actual computation of the optimal control at
time $t$ requires knowledge of $\lambda(t)$, which in turn requires knowledge of
the optimal state trajectory for time greater than or equal to $t$.
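
The link-by-link rule described above can be sketched as a small routine. It assumes the costates at the current time are already known, which is exactly the difficulty noted in the text, and the function name, the dictionary-based interface and the tolerance `tol` are illustrative choices rather than anything defined in the paper. When the optimal control is not unique the routine returns just one member of the optimal set and reports the non-uniqueness.

```python
def link_control(capacity, lam_i, lam_k, tol=1e-12):
    """Return (u, unique) for one link (i, k).

    lam_i[j] : costate of x_i^j for each destination j carried on the link
    lam_k[j] : costate of x_k^j; j == k may be omitted since that costate
               is taken to be identically zero
    u        : dict destination -> rate, one optimal choice for this link
    unique   : False when the optimal control is not uniquely determined
    """
    diff = {j: lam_k.get(j, 0.0) - lam_i[j] for j in lam_i}  # costate differences
    u = {j: 0.0 for j in lam_i}
    best = min(diff.values())
    if best > tol:
        return u, True                 # all differences positive: send nothing
    winners = [j for j, d in diff.items() if abs(d - best) <= tol]
    strictly_negative = best < -tol
    if strictly_negative:
        # Full capacity to one minimizing destination; with ties, any split
        # of the capacity among the winners would be equally good.
        u[winners[0]] = capacity
    # A zero difference leaves the rate on that destination undetermined;
    # zero is returned as one admissible choice.
    return u, strictly_negative and len(winners) == 1
```

The boolean flag marks the tied or zero-difference cases in which the text says the optimal solution set is infinitely large.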

This is the central dilemma in the application of necessary conditions
in the determination of a feedback solution. In order to overcome this
difficulty, we shall subsequently consider only the situation in
which all the inputs $\{a_i^j(t),\ \forall\, i,j \in \mathcal{N},\ j \neq i\}$ are constant functions
of time over the interval of interest $t \in [t_0, t_f]$. From the network
operation point of view, one can conceive of situations in which the
inputs are regulated at constant values, such as the backlog emptying
procedure mentioned in Section II. From the optimal control viewpoint,
constant inputs appear to provide us with the minimum amount of structure
required to characterize and construct the feedback solution with
reasonable effort.


We begin the feedback solution for the constant inputs case by
presenting a simple theorem which characterizes all those inputs which
allow the state to be driven to zero under given link rate capacity
constraints.

Theorem 3 (Controllability to zero, constant inputs)

All initial conditions of the system (7)-(10) are controllable to
zero under constant inputs if and only if

$$a \in \mathrm{Int}(\mathcal{X}) \qquad (a \in R^n,\ \mathcal{X} \subset R^n)$$

where

$$\mathcal{X} = \{\, x \mid -x = B\,u \ \text{and}\ u \in U \,\} \subset R^n$$

is the set of feasible flows attainable through the available controls.

Proof. See [5], pages 69-72.

We shall assume from here on that the controllability to zero condition
of Theorem 3 is satisfied. The following is an easy consequence of Theorem 1
and therefore the proof is omitted.

Corollary 1 (constant inputs). There always exists an optimal solution
for which the controls are piecewise constant in time and the state trajectories
have piecewise constant slopes.

The solution to the constant input problem is of the bang-bang
variety in that the optimal control switches intermittently among boundary
points of $U$. Also, in situations where one or more costate differences
are zero or several are negative and equal, the control is termed singular.
Under such circumstances, the optimal control is not determined uniquely.
In the solution technique to be presented, this non-uniqueness will play
a major role.

(Vfing  to  the  bang-bang  nature  of  the  control,  every  optimal  trajec- 
tory may  be  characterized  by  a finite  number  of  parameters.  We  now 
present  a compact  set  of  notation  for  specifying  these  parameters: 

Definition 1:

$$U(\underline{x}) \triangleq \{\underline{u}_0, \underline{u}_1, \underline{u}_2, \ldots, \underline{u}_{f-1}\}$$

and

$$T(\underline{x}) \triangleq \{t_0, t_1, \ldots, t_f\}$$

are a sequence of optimal controls and the associated control switch time
sequence which bring the state $\underline{x}$ optimally to $\underline{0}$ on $t \in [t_0, t_f]$,
where $\underline{u}_p$ is the optimal control on $t \in [t_p, t_{p+1})$, $p \in \{0, 1, \ldots, f-1\}$.

An additional property of a given optimal trajectory that shall be
of interest is which state variables travel on boundary arcs and over
what periods of time. This information is summarized in the following
definitions:

Definition 2:

$$\mathcal{B}_p = \{x_i^j \mid x_i^j(\tau) = 0,\ \tau \in [t_p, t_{p+1})\}$$

is the set of state variables traveling on boundary arcs during the
application of $\underline{u}_p$.

Definition 3:

$$\mathcal{B}(\underline{x}) = \{\mathcal{B}_0, \mathcal{B}_1, \ldots, \mathcal{B}_{f-1}\}$$

is the sequence of sets $\mathcal{B}_p$ corresponding to the application of $U(\underline{x})$
on $T(\underline{x})$. $\mathcal{B}(\underline{x})$ is referred to as the boundary sequence.
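The bookkeeping of Definitions 1 to 3 can be pictured as a small record type. The following Python sketch is purely illustrative: the class name `OptimalTrajectory`, its field names and the sample numbers are assumptions introduced here and are not notation from the paper.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass
class OptimalTrajectory:
    """Hypothetical container for U(x), T(x) and B(x) of Definitions 1-3."""
    controls: List[Tuple[float, ...]]    # U(x) = {u_0, ..., u_{f-1}}
    switch_times: List[float]            # T(x) = {t_0, ..., t_f}
    boundary_sets: List[FrozenSet[str]]  # B(x) = {B_0, ..., B_{f-1}}

    def is_consistent(self) -> bool:
        # f controls and f boundary sets go with f + 1 increasing switch times.
        f = len(self.controls)
        increasing = all(
            a < b for a, b in zip(self.switch_times, self.switch_times[1:])
        )
        return (
            len(self.switch_times) == f + 1
            and len(self.boundary_sets) == f
            and increasing
        )


# Illustrative two-segment trajectory (numbers chosen only for the example).
example = OptimalTrajectory(
    controls=[(0.5, 0.0, 1.0, 1.0), (0.5, 0.0, 1.0, 0.5)],
    switch_times=[0.0, 2.0, 3.0],
    boundary_sets=[frozenset(), frozenset({"x2"})],
)
```

Here the first segment has no state variable on the boundary and the second has one, so the boundary sequence has the same length as the control sequence, one less than the number of switch times.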


In  preparation  for  the  development  of  the  feedback  solution  we  present 
the  following  corollary  to  Theorem  1 which  narrows  down  the  freedom  of  the 
costates  at  the  final  time  indicated  by  necessary  condition  (18). 


Corollary 2 (constant inputs).

If any state variable, say $x_i^k$, is strictly positive on the final time
interval $[t_{f-1}, t_f)$ of an optimal trajectory, then $\lambda_i^k(t_f) = 0$.

Proof. Consider a specific state variable $x_i^k$ satisfying the hypothesis.
By Corollary 1 we have $\dot{x}_i^k(t_f) < 0$ since $\dot{x}_i^k(\tau)$ is constant for
$\tau \in [t_{f-1}, t_f]$. Therefore, there must exist a directed chain of links
from node $i$ to node $k$ (arbitrarily denote them by
$\{(i,i+1), (i+1,i+2), \ldots, (k-1,k)\}$) carrying some messages with
destination $k$, that is

$$u_{i,i+1}^k(t_f) > 0,\quad u_{i+1,i+2}^k(t_f) > 0,\ \ldots,\ u_{k-1,k}^k(t_f) > 0.$$

We now recall that messages may only flow optimally in the direction
of a non-positive costate difference. The sequence of costate values
$\{\lambda_i^k(t_f), \lambda_{i+1}^k(t_f), \ldots, \lambda_{k-1}^k(t_f)\}$ must therefore be non-increasing from
$i$ to $k-1$ and since $\lambda_k^k(t_f) = 0$ we must have $\lambda_{k-1}^k(t_f) \geq 0$. Consequently,
all members of the above costate sequence are non-negative.

We now proceed to show by contradiction that $\lambda_i^k(t_f) = 0$. Suppose
$\lambda_i^k(t_f) > 0$. Then the transversality condition $\underline{\lambda}^T(t_f)\,\dot{\underline{x}}(t_f) = 0$ implies
that there must be at least one $\dot{x}_l^m(t_f) < 0$ such that $\lambda_l^m(t_f) < 0$. But
the above reasoning applied to $x_l^m$ implies that $\lambda_l^m(t_f) \geq 0$. Hence,
a contradiction.

□

IV. GEOMETRICAL CHARACTERIZATION OF THE FEEDBACK SPACE FOR CONSTANT INPUTS

Our  solution  to  the  feedback  control  problem  shall  be  based  upon 
the  construction  of  regions  in  the  admissible  state  space  to  each  of 
which we associate a feasible control (controls) which is optimal within
that  region.  The  set  of  such  regions  to  be  constructed  will  cover  the 
entire  admissible  state  space,  and  therefore  the  set  of  associated  optimal 
controls  will  comprise  the  feedback  solution.  In  order  to  assist  in  the 
systematic  construction  of  these  regions,  we  focus  attention  on  regions 
with  the  following  property:  when  we  consider  every  point  of  a particular 
region to be an initial condition of the optimal control problem, a common
optimal  control  sequence  and  a common  associated  boundary  sequence  apply 
to  all  points  within  that  region.  Formally,  we  define  the  following 
subset of $\mathbb{R}^n$:

Definition 4: A set $R$, $R \subset \mathbb{R}^n$, is said to be a feedback control region
with control set $\Omega$, $\Omega \subset \mathcal{U}$, if the following properties hold:

(i) Consider any two points $\underline{x}_1, \underline{x}_2 \in \mathrm{Int}(R)$. Suppose $U(\underline{x}_1) = U$ with
associated switch time set $T(\underline{x}_1)$. Then $U(\underline{x}_2) = U$ for some switch
time set $T(\underline{x}_2)$.

(ii) $\mathcal{B}(\underline{x}_1) = \mathcal{B}(\underline{x}_2)$.

(iii) Any control $\underline{u} \in \Omega$ that keeps the state inside $R$ for a non-zero
interval of time is an optimal control and there exists at least one
such control.

A fundamental geometrical characterization of feedback control
regions may be deduced directly from the necessary conditions. This
interesting characterization, which shall subsequently be shown to be
very useful, is given by the following theorem.

Theorem 4: The feedback control regions of Definition 4 are convex
polyhedral cones in $\mathbb{R}^n$.

Proof. See [5], page 114.

Note that Theorem 4 applies for arbitrary matrices $B$ and $D$, not
only those special to our network model.

V. EXAMPLES OF THE BACKWARD CONSTRUCTION OF THE FEEDBACK SPACE

A basic  observation  with  regard  to  feedback  control  regions  is  that 
they  are  functions  of  the  entire  future  sequence  of  controls  which  carry 
any member state optimally to zero. This general dependence of the current
policy  upon  the  future  is  the  basic  dilemma  in  computing  optimal  controls. 
This  problem  is  often  accommodated  by  the  application  of  the  principle 
of  dynamic  programming,  which  seeks  to  determine  the  optimal  control  as 
a function  of  the  state  by  working  backward  from  the  final  time.  The 
algorithm  to  be  developed  employs  the  spirit  of  dynamic  programming  to 
enable  construction  of  feedback  control  regions  from  an  appropriate  set  of 
optimal  trajectories  run  backward  in  time.  These  trajectories  are 
fashioned  to  satisfy  the  necessary  and  sufficient  conditions  of  Theorem  1, 
as well as the costate boundary condition at $t_f$ given in Corollary 2.

We  motivate  the  backward  construction  technique  with  several  two 
dimensional examples which introduce the basic principles involved.

The network as pictured in Figure 1 has a single destination, node 3;
hence, we can omit the destination superscript "3" from the state and
control variables without confusion. For simplicity, we assume that the
inputs to the network are zero, so that the dynamics are:

$$\begin{aligned}
\dot{x}_1(t) &= -u_{13}(t) - u_{12}(t) + u_{21}(t) \\
\dot{x}_2(t) &= -u_{23}(t) + u_{12}(t) - u_{21}(t)
\end{aligned} \tag{47}$$

with control constraints as indicated in Figure 1. The cost function is
the total delay

$$D = \int_{t_0}^{t_f} \{x_1(t) + x_2(t)\}\,dt. \tag{48}$$

Let the vector notation be

$$\underline{x} = (x_1, x_2)^T, \qquad \underline{u} = (u_{12}, u_{21}, u_{13}, u_{23})^T.$$

We wish to find the optimal control which drives any state
$\underline{x}(t_0) \geq \underline{0}$ to $\underline{x}(t_f) = \underline{0}$ while minimizing $D$.
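As a quick numerical companion to Equation (47), the state rates can be evaluated for any candidate control. This is a minimal Python sketch, not part of the original development: the function names are hypothetical, and the capacities (0.5, 0.5, 1.0, 1.0) are an assumption read off the optimal controls quoted in this example, since Figure 1 is not reproduced here.

```python
from typing import Sequence, Tuple

# Assumed link capacities in the order (u12, u21, u13, u23).
CAPACITY = (0.5, 0.5, 1.0, 1.0)


def state_rates(u: Sequence[float]) -> Tuple[float, float]:
    """Return (dx1/dt, dx2/dt) from Equation (47) with zero inputs."""
    u12, u21, u13, u23 = u
    return (-u13 - u12 + u21, -u23 + u12 - u21)


def is_feasible(u: Sequence[float]) -> bool:
    """Check 0 <= u_ik <= capacity for every link."""
    return all(0.0 <= ui <= ci for ui, ci in zip(u, CAPACITY))
```

For instance, `state_rates((0.5, 0.0, 1.0, 0.5))` drains $x_1$ while holding $x_2$ fixed, which is the situation examined in case (i).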

As our intent is to work backward from the final time, we consider
all possible situations which may occur over the final time interval
$[t_{f-1}, t_f]$ with respect to the state variables $x_1$ and $x_2$:

(i) $\dot{x}_1(\tau) < 0$, $x_2(\tau) = \dot{x}_2(\tau) = 0$, $\tau \in [t_{f-1}, t_f]$.

This situation is depicted in Figure 2. We begin by considering the
time period $[t_{f-1}, t_f]$ in a general sense without actually fixing the
switching time $t_{f-1}$. This is simply the time period corresponding to
the final bang-bang optimal control which brings the state to zero with
$\dot{x}_1 < 0$ and $x_2 = \dot{x}_2 = 0$. We now set out to find if there
is a costate satisfying the necessary conditions for which this situation
is optimal; and if so, to find the value of the optimal control. The
linear program to be solved on $\tau \in [t_{f-1}, t_f]$ is

$$\begin{aligned}
\underline{u}^*(\tau) &= \underset{\underline{u} \in \mathcal{U}}{\mathrm{ARG\,MIN}}\ [\lambda_1(\tau)\dot{x}_1(\tau) + \lambda_2(\tau)\dot{x}_2(\tau)] \\
&= \underset{\underline{u} \in \mathcal{U}}{\mathrm{ARG\,MIN}}\ [(\lambda_2(\tau) - \lambda_1(\tau))\,u_{12}(\tau) + (\lambda_1(\tau) - \lambda_2(\tau))\,u_{21}(\tau) \\
&\qquad\qquad\qquad - \lambda_1(\tau)\,u_{13}(\tau) - \lambda_2(\tau)\,u_{23}(\tau)].
\end{aligned} \tag{49}$$

Now, the stipulation $\dot{x}_1 < 0$ tells us from Corollary 2 that

$$\lambda_1(t_f) = 0 \tag{50}$$

and since $x_1$ is on an interior arc, Equation (34) gives

$$\dot{\lambda}_1(\tau) = -1, \qquad \tau \in [t_{f-1}, t_f]. \tag{51}$$

This is shown in Figure 2. Now, since we specify $x_2 = 0$ on this
interval, its costate equation is

$$\begin{aligned}
-d\lambda_2(\tau) &= 1\,d\tau + d\eta_2(\tau) \\
d\eta_2(\tau) &\leq 0 \\
\lambda_2(t_f) &= \nu_2 \ \text{free}
\end{aligned} \qquad \tau \in [t_{f-1}, t_f], \tag{52}$$


where $\eta_2$ is a possibly discontinuous function. We now submit that
the costate value $\lambda_2(\tau) = \dot{\lambda}_2(\tau) = 0$, $\tau \in [t_{f-1}, t_f]$, satisfies the
necessary conditions and is such that there exists an optimal solution
for which $\dot{x}_1 < 0$ and $\dot{x}_2 = 0$. Firstly, the final condition
$\lambda_2(t_f) = 0$ is acceptable since the necessary conditions leave $\lambda_2(t_f)$
entirely free; also, the choice of $d\eta_2(\tau) = -d\tau$ gives $\dot{\lambda}_2(\tau) = 0$
through Equation (52). Now, the reader may readily verify that
$\lambda_2(\tau) = 0$, $\tau \in [t_{f-1}, t_f)$, is the only possible value which allows $\dot{x}_2(\tau) = 0$
optimally since $\lambda_2(\tau) > 0$ and $\lambda_2(\tau) < 0$ necessarily imply that
$\dot{x}_2(\tau) < 0$ and $\dot{x}_2(\tau) > 0$ respectively. With the costates so determined,
one solution to (49) is

$$\underline{u}(\tau) = (0.5,\ 0,\ 1.0,\ 0.5)^T. \tag{53}$$
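The structure of the linear program (49) can be made concrete with a short Python sketch. It rests on an assumption made here for illustration: the constraint set is a box, $0 \leq u_{ik} \leq c_{ik}$ with capacities (0.5, 0.5, 1.0, 1.0) in the order $(u_{12}, u_{21}, u_{13}, u_{23})$, so the minimization decouples link by link. The function name `lp_solution_set` is hypothetical.

```python
from typing import List, Tuple

CAPACITY = (0.5, 0.5, 1.0, 1.0)  # assumed bounds for (u12, u21, u13, u23)


def lp_solution_set(lam1: float, lam2: float) -> List[Tuple[float, float]]:
    """Minimise the objective of (49) over the box 0 <= u <= CAPACITY.

    Returns one (low, high) interval per control. A degenerate interval
    means the control is determined uniquely; a full interval means its
    coefficient vanishes, i.e. the control is singular.
    """
    coeffs = (lam2 - lam1, lam1 - lam2, -lam1, -lam2)
    solution = []
    for c, cap in zip(coeffs, CAPACITY):
        if c < 0:
            solution.append((cap, cap))   # negative coefficient: use full capacity
        elif c > 0:
            solution.append((0.0, 0.0))   # positive coefficient: shut the link
        else:
            solution.append((0.0, cap))   # zero coefficient: any value is optimal
    return solution
```

With $\lambda_1 > 0$ and $\lambda_2 = 0$ the sketch leaves $u_{23}$ free on $[0, 1.0]$; the control (53) is the member of this solution set with $u_{23} = 0.5$, the value that holds $\dot{x}_2 = 0$.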


We emphasize that the above solution is only one among an infinite
set of solutions to (49). However, it is the solution which we are
seeking. We now make an important observation regarding this solution.
Since $\dot{\lambda}_1(\tau) = -1$ and $\dot{\lambda}_2(\tau) = 0$ for $\tau \in [t_{f-1}, t_f]$, the control (53)
remains optimal on $\tau \in (-\infty, t_f]$. But as $t_{f-1} \to -\infty$, $x_1(t_{f-1}) \to \infty$.

Thinking now in forward time, this implies that any initial condition
on the $x_1$-axis can be brought to zero optimally with the control
specified in (53). Therefore, the $x_1$-axis is a feedback control region
in the sense of Definition 4 for which we have:

$$R = \{\underline{x} \mid x_2 = 0\} \tag{54}$$

where

$$\begin{aligned}
U &= \{(0.5,\ 0,\ 1.0,\ 0.5)^T\} \\
\mathcal{B} &= \{\{x_2\}\} \\
\Omega &= (0.5,\ 0,\ 1.0,\ 0.5)^T.
\end{aligned}$$

We have therefore determined the optimal feedback control for all
points on the $x_1$-axis. This is indicated in Figure 3.

Suppose now that we wish to consider a more general class of trajectories
associated with the end condition under discussion. What we
may do is to temporarily fix $t_{f-1}$ and stipulate that the control on
$[t_{f-2}, t_{f-1})$ has $\dot{x}_2$ negative; that is, insist that $x_2$ "leave the
boundary" backward in time. As before, the initial time $t_{f-2}$ of the
segment $[t_{f-2}, t_{f-1})$ is left free. The program to be solved is (49)
with $\tau \in [t_{f-2}, t_{f-1})$. Now, since $x_1$ is on an interior arc across
$t_{f-1}$, by (34) its costate must be continuous across $t_{f-1}$, that is

$$\lambda_1(t_{f-1}^-) = \lambda_1(t_{f-1}^+) = t_f - t_{f-1}. \tag{55}$$

Since (52) allows for only positive jumps of $\lambda_2$ forward in time,
we have

$$\lambda_2(t_{f-1}^-) = \lambda_2(t_{f-1}^+) = 0. \tag{56}$$

Also, since both $x_1$ and $x_2$ are on interior arcs on $[t_{f-2}, t_{f-1})$,
Equation (34) gives

$$\dot{\lambda}_1(\tau) = -1, \qquad \dot{\lambda}_2(\tau) = -1, \qquad \tau \in [t_{f-2}, t_{f-1}). \tag{57}$$

The resultant costate trajectory is depicted in Figure 2. We now
perform the minimization (49) for $\tau \in [t_{f-2}, t_{f-1})$. Since
$\lambda_1(\tau) > \lambda_2(\tau) > 0$, $\tau \in [t_{f-2}, t_{f-1})$, the solution is

$$\underline{u}(\tau) = (0.5,\ 0,\ 1.0,\ 1.0)^T \tag{58}$$

so that

$$\dot{x}_1(\tau) = -1.5\,; \qquad \dot{x}_2(\tau) = -0.5. \tag{59}$$

Therefore, the optimal control gives $\dot{x}_2(\tau) < 0$, which is the
situation which we desire. Once again, we see that the control is
optimal for $\tau \in (-\infty, t_{f-1})$. Since $\dot{x}_1/\dot{x}_2 = 3$, upon leaving the $x_1$-axis
backward in time the state travels parallel to the line $x_1 - 3x_2 = 0$
forever. Now, recall that $t_{f-1}$ is essentially free. Therefore, from
anywhere on the $x_1$-axis the state leaves parallel to $x_1 - 3x_2 = 0$ with
underlying optimal control (58). Thinking now in forward time, this
implies that any initial condition lying in the region between the line
$x_1 - 3x_2 = 0$ and the $x_1$-axis (not including the $x_1$-axis) may be
brought optimally to the $x_1$-axis with the control (58). See Figure 3.
Once the state reaches the $x_1$-axis, the optimal control which subsequently
takes the state to zero is given by (53).

Based upon this logic we may now readily construct the following
feedback control region:

$$R = \{\underline{x} \mid 0 < x_2 \leq x_1/3\} \tag{60}$$

where

$$\begin{aligned}
U &= \{(0.5,\ 0,\ 1.0,\ 1.0)^T,\ (0.5,\ 0,\ 1.0,\ 0.5)^T\} \\
\mathcal{B} &= \{\{\emptyset\},\ \{x_2\}\} \\
\Omega &= (0.5,\ 0,\ 1.0,\ 1.0)^T.
\end{aligned}$$

With the two feedback control regions just constructed we have
managed to fill out the region $\{\underline{x} \mid 0 \leq x_2 \leq x_1/3\}$ with optimal controls.

(ii) $\dot{x}_2(\tau) < 0$, $x_1(\tau) = \dot{x}_1(\tau) = 0$, $\tau \in [t_{f-1}, t_f]$.

This situation is the same as (i) with the roles of $x_1$ and $x_2$
simply reversed. If we let $x_2$ leave the boundary first backward in
time, we may construct a feedback control region consisting of the
$x_2$-axis in a fashion analogous to that of (i). If we subsequently allow
$x_1$ to leave the boundary backward in time, we may construct the feedback
control region $\{\underline{x} \mid 0 < x_1 \leq x_2/3\}$. These regions and associated optimal
controls are illustrated in Figure 3.

(iii) $\dot{x}_1(\tau) < 0$, $\dot{x}_2(\tau) < 0$, $\tau \in [t_{f-1}, t_f]$.

We are considering the situation in which both states go to zero
at $t_f$. Since $x_1$ and $x_2$ are on interior arcs over this time interval,
Corollary 2 gives

$$\lambda_1(t_f) = \lambda_2(t_f) = 0 \tag{61}$$

and from Equation (34)

$$\dot{\lambda}_1(\tau) = \dot{\lambda}_2(\tau) = -1, \qquad \tau \in [t_{f-1}, t_f]. \tag{62}$$

Hence, the costates are always equal over this time interval. The solution
to the linear program (49) on $[t_{f-1}, t_f]$ is:

$$\begin{aligned}
u_{13}(\tau) &= 1.0, & u_{23}(\tau) &= 1.0, \\
0 \leq u_{12}(\tau) &\leq 0.5, & 0 \leq u_{21}(\tau) &\leq 0.5
\end{aligned} \tag{63}$$

so that

$$\begin{aligned}
\dot{x}_1(\tau) &= -1.0 - u_{12}(\tau) + u_{21}(\tau) \\
\dot{x}_2(\tau) &= -1.0 + u_{12}(\tau) - u_{21}(\tau).
\end{aligned} \tag{64}$$

In this situation we have encountered non-uniqueness of the optimal
control which we seek. The optimal values of $u_{12}$ and $u_{21}$ are completely
arbitrary within their constraints. The optimal directions with
which the state leaves the origin backward in time at $t_f$ lie between
$\dot{x}_1/\dot{x}_2 = 3$ and $\dot{x}_2/\dot{x}_1 = 3$, that is, between the lines $x_1 - 3x_2 = 0$
and $x_2 - 3x_1 = 0$. Moreover, for any $\tau \in (-\infty, t_f)$ the entire set of
controls and associated directions in the state space remain optimal.
As before, we now translate this information to forward time and recognize
that for any point lying between the lines $x_1 - 3x_2 = 0$ and
$x_2 - 3x_1 = 0$ (not including these lines) the complete set of controls
(63) is optimal. Therefore, we may construct the following feedback
control region ($u_{ik} = [a,b]$ means that any value of $u_{ik}$ between $a$
and $b$ is optimal):

$$R = \{\underline{x} \mid x_2/3 < x_1 < 3x_2\} \tag{65}$$

where

$$\begin{aligned}
U &= \{([0, 0.5],\ [0, 0.5],\ 1.0,\ 1.0)^T\} \\
\mathcal{B} &= \{\{x_1, x_2\}\} \\
\Omega &= \{([0, 0.5],\ [0, 0.5],\ 1.0,\ 1.0)^T\}.
\end{aligned}$$

This region is illustrated in Figure 3.

Having completed all three cases in this fashion we have filled up
the entire state space with feedback control regions. The specification
of the optimal feedback control is therefore complete.

□ Example  1. 
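The feedback solution of Example 1 amounts to a lookup from the state to a control set. The Python sketch below collects the regions constructed above under stated assumptions: the function name is hypothetical, the controls on the $x_2$-axis and in the region next to it are inferred by symmetry from case (ii) rather than quoted from the text, and the zero control returned at the origin is a convention chosen here.

```python
from typing import Tuple, Union

Interval = Tuple[float, float]
Control = Tuple[Union[float, Interval], ...]


def feedback_example1(x1: float, x2: float) -> Control:
    """Optimal control set for Example 1, in the order (u12, u21, u13, u23)."""
    if x1 < 0 or x2 < 0:
        raise ValueError("state must be non-negative")
    if x1 == 0 and x2 == 0:
        return (0.0, 0.0, 0.0, 0.0)            # at the target (convention of this sketch)
    if x2 == 0:
        return (0.5, 0.0, 1.0, 0.5)            # x1-axis, region (54)
    if x1 == 0:
        return (0.0, 0.5, 0.5, 1.0)            # x2-axis, mirror image of (54)
    if x2 <= x1 / 3:
        return (0.5, 0.0, 1.0, 1.0)            # region (60), control (58)
    if x1 <= x2 / 3:
        return (0.0, 0.5, 1.0, 1.0)            # mirror image of region (60)
    return ((0.0, 0.5), (0.0, 0.5), 1.0, 1.0)  # region (65): intervals as in (63)
```

A tuple entry such as `(0.0, 0.5)` stands for the interval notation $[a, b]$ used in (65), meaning that any value in the interval is optimal.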

Example 2. The network is the same as for Example 1, but the cost
functional is taken as the weighted delay

$$J = \int_{t_0}^{t_f} \{2x_1(t) + x_2(t)\}\,dt. \tag{66}$$

As in Example 1, we take the approach of working backward from the
final time, beginning with the three possible situations which may occur
at that time.

(i) $\dot{x}_1(\tau) < 0$, $x_2(\tau) = \dot{x}_2(\tau) = 0$, $\tau \in [t_{f-1}, t_f]$.

The linear program to be solved over the final time interval $\tau \in [t_{f-1}, t_f]$
is (49) with $\lambda_1(\tau)$ and $\lambda_2(\tau)$ appropriately determined. The final
condition (50) applies, but since the weighting on $x_1$ is 2,
the appropriate differential equation for $\lambda_1$ is

$$\dot{\lambda}_1(\tau) = -2, \qquad \tau \in [t_{f-1}, t_f]. \tag{67}$$

Now, $\lambda_2(\tau)$ is determined in the same fashion as in case (i) of
Example 1. That is, the value

$$\lambda_2(\tau) = \dot{\lambda}_2(\tau) = 0, \qquad \tau \in [t_{f-1}, t_f) \tag{68}$$

allows the solution to (49) to be such that $\dot{x}_2(\tau) = 0$, $\tau \in [t_{f-1}, t_f]$.
Consequently, the optimal control (54) applies here. The feedback
control region on the $x_1$-axis is therefore the same as (54). See Figure 4.

Let us now allow $x_2$ to leave the boundary backward in time at
some time $t_{f-1}$. In this case we have

$$\begin{aligned}
\lambda_1(t_{f-1}^-) &= \lambda_1(t_{f-1}^+) = 2(t_f - t_{f-1}) \\
\lambda_2(t_{f-1}^-) &= \lambda_2(t_{f-1}^+) = 0.
\end{aligned} \tag{69}$$

Since both $x_1$ and $x_2$ are on interior arcs over this interval, their
differential equations are

$$\begin{aligned}
\dot{\lambda}_1(\tau) &= -2 \\
\dot{\lambda}_2(\tau) &= -1
\end{aligned} \qquad \tau \in [t_{f-2}, t_{f-1}). \tag{70}$$

Also, as before, all that matters in the solution of the linear program
is that $\lambda_1(\tau) > \lambda_2(\tau) > 0$, $\tau \in (t_{f-2}, t_{f-1})$. Therefore, the solution
is given by (58) and the feedback control region $\{\underline{x} \mid 0 < x_2 \leq x_1/3\}$
is as specified in (60). See Figure 4.

(ii) $\dot{x}_2(\tau) < 0$, $x_1(\tau) = \dot{x}_1(\tau) = 0$, $\tau \in [t_{f-1}, t_f]$.

The details of this situation are depicted in Figure 5.

We know from Corollary 2 that

$$\lambda_2(t_f) = 0 \tag{71}$$

and from (34) that

$$\dot{\lambda}_2(\tau) = -1, \qquad \tau \in [t_{f-1}, t_f]. \tag{72}$$

We now may find by the process of elimination that the only value of
$\lambda_1(\tau)$, $\tau \in [t_{f-1}, t_f]$, for which $\dot{x}_1 = 0$ is optimal is:

$$\lambda_1(\tau) = \dot{\lambda}_1(\tau) = 0, \qquad \tau \in [t_{f-1}, t_f]. \tag{73}$$

It is easily shown that $\lambda_1(\tau)$ as given in (73) satisfies the necessary
conditions. Therefore, the solution to (49) is the same as in Example 1,
case (ii), and the feedback control region on the $x_2$-axis is assigned
in identical fashion. See Figure 4.

As the next step, we now stipulate that $x_1$ leaves the boundary
backward in time at $t_{f-1}$. Since $x_2(t_{f-1}) > 0$,

$$\lambda_2(t_{f-1}^-) = \lambda_2(t_{f-1}^+) = t_f - t_{f-1}. \tag{74}$$

Since costate jumps can only be positive in forward time, we must have

$$\lambda_1(t_{f-1}^-) = 0. \tag{75}$$

Also, since $x_1(\tau) > 0$, $x_2(\tau) > 0$, $\tau \in [t_{f-2}, t_{f-1})$,

$$\begin{aligned}
\dot{\lambda}_1(\tau) &= -2 \\
\dot{\lambda}_2(\tau) &= -1
\end{aligned} \qquad \tau \in [t_{f-2}, t_{f-1}). \tag{76}$$

See Figure 5. We now notice a fundamental difference between this
and the previous situations. At some time before $t_{f-1}$ the sign of
$(\lambda_1(\tau) - \lambda_2(\tau))$ changes, which implies that the solution to the linear
program changes at that time. Therefore, $t_{f-2}$ is not allowed to run
to $-\infty$, but is actually the time at which the costates cross and the
control switches. The optimal controls and state velocities on either
side of the switch are:

$\tau \in [t_{f-2}, t_{f-1})$:

$$\underline{u} = (0,\ 0.5,\ 1.0,\ 1.0)^T \tag{77}$$

$$\dot{x}_1 = -0.5\,; \qquad \dot{x}_2 = -1.5. \tag{78}$$

$\tau \in [t_{f-3}, t_{f-2})$:

$$\underline{u} = (0.5,\ 0,\ 1.0,\ 1.0)^T \tag{79}$$

$$\dot{x}_1 = -1.5\,; \qquad \dot{x}_2 = -0.5. \tag{80}$$


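The relationship between the states $x_1$ and $x_2$ at $t_{f-2}$ may be
calculated as follows:

$$\begin{aligned}
\lambda_1(t_{f-2}) &= \lambda_1(t_{f-1}) + 2(t_{f-1} - t_{f-2}) \\
\lambda_2(t_{f-2}) &= \lambda_2(t_{f-1}) + (t_{f-1} - t_{f-2})
\end{aligned} \tag{81}$$

but

$$\begin{aligned}
\lambda_1(t_{f-1}) &= 0 \\
\lambda_2(t_{f-1}) &= t_f - t_{f-1}.
\end{aligned} \tag{82}$$

The crossing condition $\lambda_1(t_{f-2}) = \lambda_2(t_{f-2})$ implies from (81) and (82)
that

$$t_{f-1} - t_{f-2} = t_f - t_{f-1}. \tag{83}$$

Now

$$\begin{aligned}
x_1(t_{f-2}) &= x_1(t_{f-1}) + 0.5\,(t_{f-1} - t_{f-2}) \\
x_2(t_{f-2}) &= x_2(t_{f-1}) + 1.5\,(t_{f-1} - t_{f-2})
\end{aligned} \tag{84}$$

but

$$\begin{aligned}
x_1(t_{f-1}) &= 0.0 \\
x_2(t_{f-1}) &= 1.5\,(t_f - t_{f-1}).
\end{aligned} \tag{85}$$

Finally, (83) and (84) give

$$x_2(t_{f-2}) - 6x_1(t_{f-2}) = 0. \tag{86}$$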
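The arithmetic leading to (86) can be checked with exact rational numbers. The following Python sketch is only a verification aid under the rates stated in this example (costate slopes $-2$ and $-1$, state rates $-0.5$ and $-1.5$ after $x_1$ leaves the boundary, and $\dot{x}_2 = -1.5$ on the $x_2$-axis); the function name `switch_point` and the parameter `delta` are introduced here for illustration.

```python
from fractions import Fraction


def switch_point(delta: Fraction):
    """Backward construction for case (ii) of Example 2.

    delta = t_f - t_{f-1} is the time spent on the x2-axis.
    Returns (t_{f-1} - t_{f-2}, x1(t_{f-2}), x2(t_{f-2})).
    """
    lam1, lam2 = Fraction(0), delta   # costates at t_{f-1}, as in (82)
    # Backward in time lam1 grows at rate 2 and lam2 at rate 1, so they
    # cross after a backward duration d with lam1 + 2 d = lam2 + d.
    d = lam2 - lam1
    x1 = Fraction(0)                  # x1 is on the boundary at t_{f-1}
    x2 = Fraction(3, 2) * delta       # x2 drains at rate 1.5 on the x2-axis
    x1 += Fraction(1, 2) * d          # rates (78) traversed backward
    x2 += Fraction(3, 2) * d
    return d, x1, x2
```

For every positive `delta` the returned point satisfies $x_2 = 6x_1$, which is why the switch always occurs on the same line through the origin.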


That is, the switch of control corresponding to the time $t_{f-2}$ always
occurs when the state reaches the line (86). Therefore, backward in
time the state leaves from anywhere on the $x_2$-axis with optimal
control (77) and associated rate (78). The direction of travel is
actually parallel to the line $x_2 - 3x_1 = 0$. Upon reaching the line
$x_2 - 6x_1 = 0$, the optimal control switches to (79) and the state travels
parallel to the line $x_1 - 3x_2 = 0$ forever. This sequence is illustrated
for a sampled trajectory whose portions are labeled 1, 2, 3
in Figures 4 and 5.

From these observations, the following may be inferred by thinking
in forward time: The control (77) is optimal anywhere within the region
bounded by the $x_2$-axis and the line $x_2 - 6x_1 = 0$, not including the
$x_2$-axis (shaded in Figure 4). The control (79) is optimal anywhere
within the region bounded by the lines $x_2 - 6x_1 = 0$ and $x_1 - 3x_2 = 0$,
not including the former line. This region is also indicated in Figure 4.
Therefore, we can construct the following two feedback control regions:

$$R = \{\underline{x} \mid 0 < x_1 \leq x_2/6\} \tag{87}$$

where

$$\begin{aligned}
U &= \{(0,\ 0.5,\ 1.0,\ 1.0)^T,\ (0,\ 0.5,\ 0.5,\ 1.0)^T\} \\
\mathcal{B} &= \{\{\emptyset\},\ \{x_2\}\} \\
\Omega &= (0,\ 0.5,\ 1.0,\ 1.0)^T
\end{aligned}$$

and

$$R = \{\underline{x} \mid x_2/6 < x_1 < 3x_2\} \tag{88}$$

where

$$\begin{aligned}
U &= \{(0.5,\ 0,\ 1.0,\ 1.0)^T,\ (0,\ 0.5,\ 1.0,\ 1.0)^T,\ (0,\ 0.5,\ 0.5,\ 1.0)^T\} \\
\mathcal{B} &= \{\{\emptyset\},\ \{\emptyset\},\ \{x_2\}\} \\
\Omega &= (0.5,\ 0,\ 1.0,\ 1.0)^T.
\end{aligned}$$

Since the entire state space has now been filled up with feedback
control regions, the specification of the feedback solution is now
complete.


□ Example  2. 
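To see the feedback solution of Example 2 at work in forward time, one can follow it exactly from an initial state, stepping from one region boundary to the next. The Python sketch below is an illustration under the controls and regions stated above; the function names are hypothetical, and the event-stepping scheme is a convenience chosen here, not a procedure taken from the paper.

```python
from fractions import Fraction as F


def control_example2(x1, x2):
    """Feedback control for Example 2, in the order (u12, u21, u13, u23)."""
    if x2 == 0:
        return (F(1, 2), F(0), F(1), F(1, 2))  # x1-axis, region (54)
    if x1 == 0:
        return (F(0), F(1, 2), F(1, 2), F(1))  # x2-axis
    if 6 * x1 <= x2:
        return (F(0), F(1, 2), F(1), F(1))     # region (87), control (77)
    return (F(1, 2), F(0), F(1), F(1))         # regions (88) and (60), control (79)


def rates(u):
    """State rates from Equation (47)."""
    u12, u21, u13, u23 = u
    return (-u13 - u12 + u21, -u23 + u12 - u21)


def simulate(x1, x2):
    """Follow the feedback law exactly and return the visited breakpoints."""
    x1, x2 = F(x1), F(x2)
    path = [(x1, x2)]
    while (x1, x2) != (0, 0):
        d1, d2 = rates(control_example2(x1, x2))
        # Candidate event times: x1 reaches 0, x2 reaches 0, or the state
        # reaches the switch line x2 - 6 x1 = 0 from below.
        events = []
        if x1 > 0 and d1 < 0:
            events.append(-x1 / d1)
        if x2 > 0 and d2 < 0:
            events.append(-x2 / d2)
        g, dg = x2 - 6 * x1, d2 - 6 * d1
        if g < 0 and dg > 0:
            events.append(-g / dg)
        dt = min(events)
        x1, x2 = x1 + d1 * dt, x2 + d2 * dt
        path.append((x1, x2))
    return path
```

Starting from the state (4, 7), which lies in region (88), the path first meets the line $x_2 - 6x_1 = 0$, then the $x_2$-axis, and finally the origin, mirroring the three labeled portions of the sampled trajectory.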


We now summarize the contents of the preceding examples. By starting
at the final time $t_f$ we have allowed state variables to leave the
boundary $\underline{x} = \underline{0}$ backward in time and have computed the corresponding
optimal trajectories as time runs to minus infinity. In the instances
when the optimal control did not switch, we were able to construct one
feedback control region. When the optimal control did switch, as in
case (ii) of Example 2, two adjacent feedback control regions were
constructed. By considering enough cases we were able to fill up the
entire state space with feedback control regions, thus providing the
feedback solution.

Note that all we need for the final specification of the feedback
solution are the geometrical descriptions of the feedback control regions
($R$'s) and their associated optimal control sets ($\Omega$'s). The sequences
of optimal controls ($U$'s) and the boundary sequences ($\mathcal{B}$'s) are involved
in an intermediate fashion.

VI. THE CONSTRUCTIVE DYNAMIC PROGRAMMING ALGORITHM


The  examples  of  the  previous  section  suggest  an  approach  by  which 
the  feedback  solution  to  the  constant  inputs  problem  may  be  synthesized 
in  general : 


The  Constructive  Dynamic  Programming  Concept. 


Construct a set of backward optimal trajectories, each starting
at the final time $t_f$ and running to $t = -\infty$, among which all
possible sequences of state variables leaving the boundary backward
in time, both singly and in combination, are represented. Each
segment of every optimal trajectory (where a segment is that portion
which occurs on the time interval between two successive switch times
$t_p$ and $t_{p+1}$, not including $t_{p+1}$) is utilized in the construction
of a feedback control region with associated optimal control set.
These feedback control regions are convex polyhedral cones, and the
union of all such regions is the entire admissible state space.


The conceptual structure of an algorithm which realizes the constructive
dynamic programming concept is now presented. Due to several
complicating features the algorithm as it is presented here is not in a
form suitable for numerical computation. Instead, it serves as a framework
for the development of numerical schemes for special simplifying
situations. First, we present some shorthand notation:


Definition 5: $\mathcal{I}_p \triangleq \{x_i^j \mid x_i^j(\tau) > 0,\ \forall \tau \in [t_p, t_{p+1})\}$ is the set of
state variables traveling on interior arcs on $[t_p, t_{p+1})$.

Definition 6: $\mathcal{L}_p \triangleq \{x_i^j \mid x_i^j \in \mathcal{B}_p$ and $x_i^j$ is designated to leave the
boundary backward in time at $t_p\}$.

Definition 7: $\sigma_p \triangleq$ cardinality of $\mathcal{I}_p$; $\rho_p \triangleq$ cardinality of $\mathcal{L}_p$.

Definition 8: $R_p \triangleq$ the feedback control region constructed from the
optimal trajectories on the segment $[t_p, t_{p+1})$.

The algorithm is characterized by the recursive execution of a
basic step in which one or more feedback control regions are constructed
from a previously constructed feedback control region of lower dimension.
To describe a single recursive step of the algorithm we begin with the
feedback control region $R_p$ which has been constructed in a previous
step. On the current backward optimal trajectories the state variables
of $\mathcal{I}_p$ are on interior arcs and those of $\mathcal{B}_p$ are on boundary arcs.
Hence, $R_p \subset \mathbb{R}^{\sigma_p}$, where we assume that $\sigma_p < n$. The basic action
of each step of the algorithm is to allow a subset $\mathcal{L}_p$ of state
variables in $\mathcal{B}_p$ to leave the boundary backward in time simultaneously;
that is, allow the state trajectory to leave $R_p \subset \mathbb{R}^{\sigma_p}$ and travel
directly into $\mathbb{R}^{\sigma_p + \rho_p}$. The set of state variables which are subsequently
on interior arcs is $\mathcal{I}_{p-1}$.

In order to formulate the algorithm we must make the following
assumption: it is optimal for all of the state variables in $\mathcal{I}_p$
to remain off of the boundary as time runs to minus infinity. This
is equivalent to assuming that once a state variable reaches the boundary
in forward time it is always optimal for it to remain on the boundary.
This assumption is certainly not always valid, and a counter-example is
presented in Example 3-7 of [5], p. 197. The most general class of problems
for which this assumption holds is not currently known. However, in [5], p. 26?,
it is shown to be valid for the specific class of single destination
network problems with all unity weightings in the cost functional.

We now provide the rule which stipulates the complete set of steps which
is to be executed with respect to $R_p$:

Consider all of the subsets of $\mathcal{B}_p$ which are combinations of its
elements taken $1, 2, \ldots, n - \sigma_p$ at a time. Steps are to be executed for
$\mathcal{L}_p$ equal to each one of the subsets so determined, or a total of
$2^{\,n - \sigma_p} - 1$ steps.
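The rule above is an enumeration of all non-empty subsets of the boundary set. A minimal Python sketch follows; the function name `leave_sets` and the string labels used for state variables are assumptions made for illustration.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, List


def leave_sets(boundary_vars: Iterable[str]) -> List[FrozenSet[str]]:
    """Return every non-empty subset of B_p, smallest subsets first."""
    items = sorted(boundary_vars)
    subsets = []
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            subsets.append(frozenset(combo))
    return subsets
```

For a boundary set with $n - \sigma_p = 3$ elements the function returns $2^3 - 1 = 7$ candidate sets, one per step to be executed.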

We now describe a single step of the algorithm by choosing a particular
$\mathcal{L}_p \subset \mathcal{B}_p$. Figure 6 is used to illustrate this description.

STEP OF THE ALGORITHM

Operation 1. Partition $R_p$ into subregions with respect to $\mathcal{L}_p$.

The definition of subregion is deferred until Operation 3 since notions
are required which are developed in the interim. Subregions, like feedback
control regions, are convex polyhedral cones and the method by which
the partition may be performed is presented in [5], p. 165. For the present,
let us assume that $R_p$ has been partitioned into $s$ subregions and
denote them by $R_p^1(\mathcal{L}_p), R_p^2(\mathcal{L}_p), \ldots, R_p^s(\mathcal{L}_p)$, where the dependence of
the partition on the set $\mathcal{L}_p$ is indicated in parenthesis. We now
perform the subsequent operations of the step for each of the $s$ subregions
taken one at a time.

Figure 6. Construction of Successive Feedback Control Regions
from Subregion $R_p(\mathcal{L}_p)$


Operation 2. Consider the typical subregion $R_p(\mathcal{L}_p)$. We now call
for the state variables in $\mathcal{L}_p$ to leave backward in time from each
of a finite set of points of $R_p(\mathcal{L}_p)$ taken one at a time. This set of
points is denoted by $X_p(\mathcal{L}_p)$ and as in the case of subregions the
definition is deferred until Operation 3. Let us now focus attention
on a typical such point $\underline{x}_p \in X_p(\mathcal{L}_p)$. We assume that $\underline{x}_p$ has been
reached through a backward optimal trajectory constructed from a sequence
of previous steps, and that the time at which $\underline{x}_p$ is reached along
this trajectory is $t_p$. Associated with $\underline{x}_p$ at $t_p$ is some possibly
nonunique set of costate vectors. We are interested in only those
costate vectors which allow for the optimal departure of the state variables
in $\mathcal{L}_p$ from the boundary backward in time at $t_p$, known appropriately
as leave-the-boundary costates. This set may also be nonunique, in which
case it will in fact be infinite. It is shown in [5], p. 189, that we need
only consider a certain finite subset of the total leave-the-boundary
costate set and a method for determining this particular set of costate
vectors, or for showing that no such costate vectors exist, is presented.
We assume now that this set has been found and denote it by $\Lambda_p$.

P 


- Ope  rat  ion  3 Consider  the  typical  1 eave-the-bounda ry  costate 

X € A . We  now  consider  the  situation  in  which  the  state  variables  in 
-P  P 

•C  leave  the  subregion  R (.£  ) backward  in  time  from  the  point  x . 

P P P -p 

We  note  that  the  set  of  state  variables  which  are  traveling  on  boundary 
arcs  backward  in  time  subsequent  to  the  departure  of  is 

6 =8  I £ and  the  set  on  interior  arcs  is  I , = I u X . 

p-1  p p p-1  p p 

We  must  now  solve  the  following  problem: 


Given  the  state  x and  the  costate  X at  time  t . find  all 
optimal  trajectories  backward  in  time  on  t € (-», tp)  for  which 
x-j(-r)  = 0 for  all  x-j  £ ^p-i  — determine  that  no  such  point 
trajectory  exists. 


According to the assumption stated earlier in this section it is optimal
for all of the state variables of $\mathcal{I}_{p-1}$ to remain off of the
boundary for the entire time interval $\tau \in (-\infty, t_p)$. Therefore, by
the necessary conditions (which are also sufficient) we know that any (and all)
trajectories which solve the above problem must have a control which
satisfies the following, henceforth referred to as the global optimization
problem:

Find all

$$u^*(\tau) = \arg\min_{u(\tau) \in U} \lambda^T(\tau)\,\dot{x}(\tau)
            = \arg\min_{u(\tau) \in U} \lambda^T(\tau)\,B\,u(\tau) \tag{89}$$
where

$$\lambda(t_p) = \lambda_p \tag{90}$$

$$-\dot{\lambda}_i^j(\tau) = \alpha_i^j \qquad \forall x_i^j \in \mathcal{I}_{p-1} \tag{91}$$

(92) [not legible in the source]

$$\forall \tau \in (-\infty, t_p).$$


Our task is therefore to find all solutions to the global optimization
problem which satisfy the constraints $x_i^j(\tau) = 0$ for all
$x_i^j \in \mathcal{B}_{p-1}$ and all $\tau \in (-\infty, t_p)$, or show that
no such solution exists. To find solutions requires producing values of
$\lambda_i^j(\tau)$ such that $x_i^j(\tau) = 0$ is optimal for all
$x_i^j \in \mathcal{B}_{p-1}$ and all $\tau \in (-\infty, t_p)$. A method for
solving this problem is presented in Appendix A.

If it is shown that no solution exists we immediately terminate
this step. On the other hand, assume that using the technique of Appendix A
we have arrived at a sequence of optimal switching times and optimal control
sets on $\tau \in (-\infty, t_p)$. Suppose that $q$ switches occur in the
optimal control over this interval and denote the times at which the switches
occur by $t_{p-q}, \ldots, t_{p-2}, t_{p-1}$, where the control remains
unchanged from time $t_{p-q}$ to minus infinity. At these switching times the
backward optimal trajectory intersects the hypersurfaces of various dimensions
which separate adjacent feedback control regions. The points of intersection
are referred to as breakpoints and the hypersurfaces, which are convex
polyhedral cones of dimension $\sigma_p + \rho_p - 1$, are referred to as
breakwalls. We denote by $w_{p-s}$ the breakwall which is encountered at the
$s$-th switch time $t_{p-s}$ and denote the entire set of breakwalls
encountered on $\tau \in (-\infty, t_p)$ by

$$W = \{w_{p-q}, \ldots, w_{p-2}, w_{p-1}\}.$$

We shall show how to construct $W$ later on in this operation.

Define $\Omega_{p-s}$ to be the complete set of optimal controls on
$\tau \in [t_{p-s}, t_{p-s+1})$ which satisfy the constraints
$x_i^j(\tau) = 0$ for all $x_i^j \in \mathcal{B}_{p-1}$, or formally

$$\Omega_{p-s} = \Big\{ u^* \;\Big|\; u^* = \arg\min_{\substack{u \in U \\ x_i^j(\tau) = 0 \ \forall x_i^j \in \mathcal{B}_{p-1}}} \lambda^T(\tau)\,B\,u(\tau) \Big\},$$

$\tau \in [t_{p-s}, t_{p-s+1})$, where $\lambda(\tau)$ is determined by
(90)-(92) and $\lambda_p$ ranges over all members of $\Lambda_p$.

Accordingly, the collection of optimal control sets on
$\tau \in (-\infty, t_p)$ is denoted

$$\Omega = \{\Omega_{-\infty}, \Omega_{p-q}, \ldots, \Omega_{p-2}, \Omega_{p-1}\}$$

where $\Omega_{-\infty}$ is the solution set which applies from time
$t_{p-q}$ to minus infinity.

We are now able to provide various details which have been left unspecified
until now. First, the definitions of subregion and the set of points
$X_p(\mathcal{L}_p) \subset R_p(\mathcal{L}_p)$ mentioned in Operations 1 and 2.

Definition 9: Suppose the set of state variables $\mathcal{L}_p$ is
designated to leave the feedback control region $R_p$ backward in time. Then a
subregion $R_p(\mathcal{L}_p)$ of $R_p$ is the set of all those points in
$R_p$ which, when taken as the point of departure of $\mathcal{L}_p$, result
in a common $\Omega$ and a common $W$.

Definition 10: If no control switches occur on $\tau \in (-\infty, t_p)$
then $X_p(\mathcal{L}_p)$ consists of exactly one point, and this may be any
point of $R_p(\mathcal{L}_p)$. If one or more control switches occur (i.e.,
one or more breakwalls are encountered) then $X_p(\mathcal{L}_p)$ consists of
exactly one point from each edge of $R_p(\mathcal{L}_p)$, where we may choose
any point of a given edge.


Therefore, if no control switches occur we have exhausted
$X_p(\mathcal{L}_p)$ by the consideration of the single point $x_p$. On the
other hand, if one or more control switches occur then we must repeat
Operations 2 and 3 for all of the remaining points of $X_p(\mathcal{L}_p)$.

By the definition of subregion we shall obtain the same collection
of optimal control sets $\Omega$ and encounter the same set of breakwalls $W$
for every point in $X_p(\mathcal{L}_p)$. However, the breakpoint corresponding
to a given breakwall will in general be different for optimal trajectories
emanating from different points of $X_p(\mathcal{L}_p)$ or for different
optimal trajectories emanating from the same point. We may now specify how to
construct the breakwalls from the breakpoints: Find the complete set of
breakpoints occurring at the $s$-th switch time which correspond to
extreme point solutions of $\Omega$, where we consider trajectories emanating
from every point of $X_p(\mathcal{L}_p)$. Form the set of rays in the state
space which pass through these breakpoints. Then $w_{p-s}$ is the convex
hull of all the rays.
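The passage from breakpoints to a breakwall can be sketched in a few lines of
Python. The sketch assumes that the rays emanate from the origin (the
breakwalls are cones) and that every breakpoint has non-negative coordinates
with a positive sum, so that dividing by the coordinate sum gives one
representative per ray. The names `rays_through` and `cone_point` are
hypothetical.

```python
from fractions import Fraction


def rays_through(breakpoints):
    """Return one normalized representative per distinct ray through the breakpoints."""
    rays = set()
    for point in breakpoints:
        coords = [Fraction(c) for c in point]
        total = sum(coords)
        if total <= 0:
            raise ValueError("a breakpoint must have a positive coordinate sum")
        # Breakpoints on the same ray collapse to the same normalized tuple.
        rays.add(tuple(c / total for c in coords))
    return sorted(rays)


def cone_point(rays, weights):
    """A point of the breakwall: a non-negative combination of its rays."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    dim = len(rays[0])
    return tuple(sum(w * r[k] for w, r in zip(weights, rays)) for k in range(dim))


# Three breakpoints, two of which lie on the same ray.
rays = rays_through([(1, 2, 0), (2, 4, 0), (0, 0, 3)])
```

Exact rational arithmetic is used so that collinear breakpoints are recognized
as the same ray without a numerical tolerance.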

Operation 4: The sets $\Omega$ and $W$ obtained in the previous operation
are now utilized to construct feedback control regions. We consider the
two cases:

(i) $q = 0$

In this case $\Omega = \{\Omega_{-\infty}\}$ and $W = \{\emptyset\}$. Consider
the linear transformation $y = -B u - a$ and the convex polyhedral set
$Y_{-\infty} = \{ y \mid u \in \Omega_{-\infty} \}$. For every extreme point of
$Y_{-\infty}$ form the ray in $\mathbb{R}^{\sigma_p + \rho_p}$ which passes
through that extreme point. If there are $w$ extreme points then denote the
set of rays by $V_{-\infty} = \{v_1, v_2, \ldots, v_w\}$. It is readily seen
that each of these rays represents an extreme direction of travel of the
optimal trajectory in the state space. Now, let $\mathrm{Co}(\cdot)$ denote
the convex hull function and form the convex polyhedral cone

$$R_{-\infty} = \mathrm{Co}(R_p(\mathcal{L}_p) \cup V_{-\infty}) / R_p(\mathcal{L}_p).$$

It is proven in Appendix B that $R_{-\infty}$ is a feedback control region
with associated optimal control set $\Omega_{-\infty}$ in the sense of
Definition 4. We refer to $R_{-\infty}$ as a non-break feedback control region.
and

(ii) $q > 0$

In this case
$\Omega = \{\Omega_{-\infty}, \Omega_{p-q}, \ldots, \Omega_{p-2}, \Omega_{p-1}\}$
and $W = \{w_{p-q}, \ldots, w_{p-2}, w_{p-1}\}$. Form the sequence of $q+1$
adjacent convex polyhedral cones

$$R_{p-1} = \mathrm{Co}(R_p(\mathcal{L}_p) \cup w_{p-1}) / R_p(\mathcal{L}_p)$$

$$R_{p-2} = \mathrm{Co}(w_{p-1} \cup w_{p-2}) / w_{p-1}$$

$$\vdots$$

$$R_{p-q} = \mathrm{Co}(w_{p-q+1} \cup w_{p-q}) / w_{p-q+1}$$

$$R_{-\infty} = \mathrm{Co}(w_{p-q} \cup V_{-\infty}) / w_{p-q}$$

It is proven in [5], p. 176, that $R_{p-1}, R_{p-2}, \ldots, R_{p-q}$ are
feedback control regions with associated optimal control sets
$\Omega_{p-1}, \Omega_{p-2}, \ldots, \Omega_{p-q}$ respectively. These are
referred to as break feedback control regions. Here $R_{-\infty}$ is the
non-break feedback control region with associated optimal control set
$\Omega_{-\infty}$. See Appendix B for proof.
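The ray construction used for the non-break region can be illustrated with a
short Python sketch. It assumes the dynamics $\dot{x} = B u + a$, so that the
direction of travel backward in time under a control $u$ is $-(B u + a)$. The
matrix, the input vector and the list of extreme controls below are made-up
values, and `backward_direction` and `extreme_rays` are hypothetical names.

```python
def backward_direction(B, a, u):
    """Direction of travel backward in time, -(B u + a), for a single control u."""
    return tuple(-(sum(b * uk for b, uk in zip(row, u)) + ai) for row, ai in zip(B, a))


def extreme_rays(B, a, extreme_controls):
    """One direction per extreme control; exact repeats and the zero direction are dropped."""
    rays = []
    for u in extreme_controls:
        y = backward_direction(B, a, u)
        if any(y) and y not in rays:
            rays.append(y)
    return rays


# Two state variables, two controls (illustrative numbers only).
B = [(-1, 0), (1, -1)]
a = (1, 1)
V = extreme_rays(B, a, [(3, 5), (3, 4), (3, 5)])
```

The sketch does not merge directions that are positive multiples of one
another; a fuller version would normalize each direction before comparing.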

□ Step of Algorithm

Note that upon the completion of a single step $q+1$ feedback
control regions have been constructed: exactly one non-break feedback
control region and $q$ break feedback control regions, $0 \le q < \infty$.

We may refer back to Example 2 to find simple examples of both types of
feedback control regions: the region specified in (87) is a break
feedback control region and that of (88) is of the non-break variety.
Having detailed a single step we now discuss how the overall algorithm
operates. The procedure is initiated at $t_f$ with the first feedback
control region $R_f$ being the origin. Here the set $\mathcal{B}_f$ is
composed of all the state variables of the problem. We allow $\mathcal{L}_f$
to range over all possible $2^n - 1$ non-empty subsets of $\mathcal{B}_f$ and
perform a step of the algorithm for each. To this end we know by Corollary 2
that the values of the costates at $t_f$ corresponding to those state
variables leaving the boundary at $t_f$ are zero. The constrained optimization
of Appendix A may then be solved since only those costates are required which
correspond to state variables off the boundary. For each set of state
variables leaving the boundary which is found to have globally optimal
trajectories, feedback control regions are constructed which range from
one-dimensional (axes of $\mathbb{R}^n$) to $n$-dimensional subsets of
$\mathbb{R}^n$. At each step we propagate backward in time an appropriate set
of state and costate trajectories and save the information which is required
to execute subsequent steps. Each region of the set thus constructed is used
as the starting point for the sequence of steps which builds new higher
dimensional regions. This process continues until all the feedback control
regions which are constructed are $n$-dimensional. Note that the complete set
of backward state and costate trajectories which is constructed during the
execution of the algorithm will not in general be unique due to the
arbitrariness in the selection of the set $X_p(\mathcal{L}_p)$ at each step.
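The exhaustive choice of leaving sets at the start of the procedure amounts
to enumerating every non-empty subset of the boundary set. A minimal Python
sketch, with `nonempty_subsets` and the variable labels as illustrative
assumptions:

```python
from itertools import combinations


def nonempty_subsets(variables):
    """Yield every non-empty subset of `variables`, smallest subsets first."""
    items = sorted(variables)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


# With n = 3 state variables on the boundary there are 2**3 - 1 = 7 candidates.
candidates = list(nonempty_subsets({"x1", "x2", "x3"}))
```

The count doubles with each additional state variable, which is one source of
the large number of steps noted for networks of reasonable size.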

We point out that the feedback control regions constructed during
a particular step may have been constructed previously. In essence, we
are being conservative in insisting that $\mathcal{L}_p$ be set equal
successively to all possible non-empty subsets of $\mathcal{B}_p$, but no
method is currently known for the a priori elimination of those subsets which
will produce previously constructed regions. However, our thoroughness allows
us to state the following:

Theorem 5. Complete execution of the constructive dynamic programming
algorithm will result in the specification of the optimal feedback control
over the entire admissible state space.

Proof: Feedback control regions are constructed for every conceivable
type of optimal trajectory in terms of sequences of state variables
on and off boundary arcs. Moreover, we are finding the largest such regions
since we are taking into account all optimal controls corresponding to
each sequence. Therefore, the feedback control regions constructed must
cover the entire admissible state space.

□

Summarizing, the following questions have been left unresolved
in the current discussion:

1) The validity of the assumption that it is optimal for all the state
variables in $\mathcal{I}_{p-1}$ to remain off the boundary as time runs to
minus infinity.

2) Partitioning $R_p$ into subregions (Operation 1).

3) Determining the leave-the-boundary costate values (Operation 2).

4) Determination of global optimality (Operation 3, part (b) of
Appendix A).

As the algorithm is presented here in principle only we shall not
enter into details regarding off-line calculation or on-line implementation.
However, two points are worthy of mention. First, the number of
steps to be performed and the number of feedback control regions constructed
will be very large for networks of reasonable size. In constructing a
numerical version of the algorithm we must therefore be concerned with
the efficiency of the various operations. Secondly, a large amount of
computer storage will be required to implement the solution in real time.
The feedback control regions must be specified by a set of linear
inequalities which in general may be very large, and the optimal controls
within these regions must also be specified. This situation illustrates the
tradeoff which occurs between the storage which is required for the on-line
implementation of feedback solutions calculated off-line and the
amount of calculation involved in the repeated on-line calculation of
open-loop solutions.
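The storage side of this tradeoff can be made concrete with a small Python
sketch of an on-line lookup. It assumes that each feedback control region is
stored as a system of linear inequalities $G x \le h$ together with a label
for its optimal control set. The dictionary layout, the two example cones and
the names `contains` and `lookup_control` are hypothetical.

```python
def contains(region, x, tol=1e-9):
    """True if the state x satisfies every inequality G x <= h of the region."""
    return all(
        sum(g * xi for g, xi in zip(row, x)) <= bound + tol
        for row, bound in zip(region["G"], region["h"])
    )


def lookup_control(regions, x):
    """Return the stored control set of the first region containing the state x."""
    for region in regions:
        if contains(region, x):
            return region["controls"]
    raise LookupError("state lies outside every stored region")


# Two cones that split the non-negative quadrant along the line x1 = x2.
REGIONS = [
    {"G": [(-1, 1), (0, -1)], "h": [0, 0], "controls": "control set A"},  # 0 <= x2 <= x1
    {"G": [(1, -1), (-1, 0)], "h": [0, 0], "controls": "control set B"},  # 0 <= x1 <= x2
]
```

The lookup is a linear scan, so its on-line cost grows with the number of
stored regions and inequalities, while no optimization is solved on-line.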


VII.  CONCLUSIONS 


We  have  considered  the  linear  optimal  control  problem  with  linear 
state  and  control  variable  inequality  constraints  proposed  in  [2]  as 
a method  of  analyzing  dynamic  routing  in  data  communication  networks. 

The conceptual structure of the Constructive Dynamic Programming
Algorithm has been presented for finding the feedback solution to this
problem when all the inputs to the network are assumed to be constant
in time. Several required tasks of the algorithm pose complex questions
in themselves and are therefore left unresolved here. These questions
are confronted in detail in [5] and in a forthcoming paper by the authors,
where, in the case of single destination networks with all unity
weightings in the cost functional, simplifications arise which permit
a numerical formulation of the algorithm.
APPENDIX A - COMPUTING BACKWARD OPTIMAL TRAJECTORIES


Consider the following constrained optimization problem (i.e. constrained
in state) in which the $\eta_i^j$ do not appear:

Find all

$$u^*(\tau) = \arg\min_{u(\tau) \in U} \sum_{x_i^j \in \mathcal{I}_{p-1}} \lambda_i^j(\tau)\,\dot{x}_i^j(\tau) \tag{A.1}$$

subject to

$$x_i^j(\tau) = 0 \qquad \forall x_i^j \in \mathcal{B}_{p-1} \tag{A.2}$$

where

$$\lambda_i^j(t_p) = \text{appropriate component of } \lambda_p \tag{A.3}$$

$$-\dot{\lambda}_i^j(\tau) = \alpha_i^j \tag{A.4}$$

$$\forall x_i^j \in \mathcal{I}_{p-1}, \qquad \forall \tau \in (-\infty, t_p).$$

The following is presented without the proof, which is trivial:

Theorem A.1 Any solution to the global optimization problem which
satisfies $x_i^j(\tau) = 0$ for all $x_i^j \in \mathcal{B}_{p-1}$ is also a
solution to the constrained optimization problem.

We are able to solve the constrained optimization problem immediately
since we know all of the coefficients of (A.1) and the values of
$\lambda_i^j$ for $x_i^j \in \mathcal{B}_{p-1}$ are not required. However,
solutions to the constrained optimization problem may not be solutions to the
global optimization problem. These observations suggest the following
two-part approach to finding all solutions to the global optimization problem
which satisfy $x_i^j = 0$ for all $x_i^j \in \mathcal{B}_{p-1}$.

(a) Find all solutions to the constrained optimization problem.

(b) Produce values of $\lambda_i^j(\tau)$, $\tau \in (-\infty, t_p)$, for all
$x_i^j \in \mathcal{B}_{p-1}$ which satisfy the necessary conditions and such
that all solutions to part (a) are also solutions to the global optimization
problem, or show that no such values exist.

The above tasks were performed in a simple fashion for the examples
of Section V, where due to the small dimensionality of the problems we
were able to solve part (b) by inspection. Of course, this is rarely
possible, and a general method for solving part (b), referred to as the
determination of global optimality, is presented in [5], p. 163.

We now turn our attention to the solution of part (a). Taking into
account the dynamics (7) and integrating (A.4) backward in time from $t_p$
we may re-write (A.1)-(A.4) in terms of the underlying decision vector
$u$ as follows:

$$u^*(\tau) = \arg\min_{u(\tau)} \,(c_0 + \tau c_1)\,u(\tau) \tag{A.5}$$

$$D\,u(\tau) \le C \tag{A.6}$$

$$u(\tau) \ge 0 \tag{A.7}$$

$$b_i^j\,u(\tau) = -a_i^j \qquad \forall x_i^j \in \mathcal{B}_{p-1} \tag{A.8}$$

where $b_i^j$ is the row of $B$ corresponding to $x_i^j$,

$$c_0 = \sum_{x_i^j \in \mathcal{I}_{p-1}} \lambda_i^j(t_p)\,b_i^j, \qquad
  c_1 = \sum_{x_i^j \in \mathcal{I}_{p-1}} \alpha_i^j\,b_i^j,$$

and $\tau$ is time running backward from $t_p$ to minus infinity.

The presence of the constraints (A.8) prevents us from immediately
specifying the optimal solution at a given time in terms of the costates
as is possible in the absence of these constraints. However, since for
fixed $\tau$ (A.5)-(A.8) is a linear program, the Simplex technique may
be applied to find a solution. Moreover, the cost function of (A.5) is
a linear function of the single independent parameter $\tau$, while the
constraints are not a function of $\tau$ since $a$ is constant. This is
precisely the form which can be accommodated by parametric linear
programming with respect to the cost coefficients. The solution proceeds
as follows:

Set $\tau = \delta$, where $\delta$ is some small positive number which
serves to perturb all costate values by $\alpha_i^j \delta$. We wish to start
our solution at time $t_p - \delta$ since we may have $\lambda_i^j(t_p) = 0$
for some $x_i^j \in \mathcal{I}_{p-1}$, so that the solution exactly at $t_p$
may not correspond to $x_i^j$ leaving the boundary. The number $\delta$ must
be such that $0 < \delta < t_{p-1}$, where $t_{p-1}$ is the first break time
to be encountered backward in time.

We now use the Simplex technique to solve the program at $\tau = \delta$.
There are many linear programming computer packages which may be enlisted
for this task, and these utilize efficient algorithmic forms of the Simplex
technique to arrive at a single optimal extremum solution. Given this
starting solution, which we call $u_{p-1}$, such packages are also equipped
to employ parametric linear programming to find the value of $\tau$ for
which the current solution ceases to be optimal as well as a new optimal
solution. These are the break time $t_{p-1}$ and the optimal control
$u_{p-2}$ respectively. We continue in this fashion to find controls and break
times until the solution remains the same for $\tau$ arbitrarily large.
This final solution is the control $u_{-\infty}$.
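The sequence of controls and break times produced by this procedure can be
illustrated without a Simplex code. The Python sketch below is a brute-force
stand-in, not the parametric Simplex method itself: it assumes that the
extreme points of the feasible set have already been listed, and it traces the
lower envelope of the lines $\tau \mapsto (c_0 + \tau c_1) u$ over those
points. The function name `parametric_schedule` and all numerical values are
illustrative assumptions.

```python
from fractions import Fraction


def dot(c, u):
    """Exact inner product of two equal-length vectors."""
    return sum(Fraction(ci) * Fraction(ui) for ci, ui in zip(c, u))


def parametric_schedule(vertices, c0, c1, delta):
    """Return [(tau, vertex), ...]: the vertex minimizing (c0 + tau*c1).u from tau onward."""
    # Each vertex contributes a line: intercept c0.u and slope c1.u.
    lines = [(dot(c0, v), dot(c1, v), v) for v in vertices]
    tau = Fraction(delta)
    # Optimal vertex at tau = delta; ties go to the smaller slope, which wins just after delta.
    current = min(lines, key=lambda ln: (ln[0] + tau * ln[1], ln[1]))
    schedule = [(tau, current[2])]
    while True:
        candidates = []
        for ln in lines:
            if ln[1] < current[1]:  # only a smaller slope can overtake as tau grows
                crossing = (ln[0] - current[0]) / (current[1] - ln[1])
                candidates.append((max(crossing, tau), ln[1], ln))
        if not candidates:
            return schedule  # the last vertex stays optimal for tau arbitrarily large
        tau, _, current = min(candidates, key=lambda c: (c[0], c[1]))
        schedule.append((tau, current[2]))


# Three hypothetical extreme points of the feasible set and made-up cost vectors.
schedule = parametric_schedule(
    vertices=[(1, 0), (0, 1), (1, 1)], c0=(1, 2), c1=(1, -1), delta=Fraction(1, 10)
)
```

In this example the first vertex is optimal from $\tau = 1/10$ until the break
at $\tau = 1/2$, after which the second vertex remains optimal indefinitely.
Listing all extreme points is itself expensive, which is why the text relies
on parametric linear programming to move from one optimal solution to the next.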


The linearity of the pointwise minimization associated with the
necessary conditions has enabled us to find a sequence of optimal controls
on the time interval $(-\infty, t_p)$ by the efficient technique of parametric
linear programming. However, in the description of Operation 3, we call
for all optimal solutions on every time segment. Since we are dealing
with a linear program, the specification of all optimal solutions is
equivalent to the specification of all optimal extrema of the solution
set. Unfortunately, it turns out that the problem of finding all the
optimal extremum solutions to a linear program is an extremely difficult
one. It is easily shown that given an initial optimal extremum solution
this problem is equivalent to finding all the vertices of a convex
polyhedral set defined by a system of linear equality and inequality
constraints. Discussion of this problem has appeared intermittently in the
linear programming literature since the early 1950's, where several algorithms
based upon different approaches have been presented. However, none of
these methods has proven computationally efficient for a reasonably
large variety of problems. The fundamental difficulty which appears
to foil many algorithms, no matter what their underlying approach, is
degeneracy in the original linear program. As our problem is characterized
by a high degree of degeneracy, one would expect poor performance
from any of these algorithms. Hence, it appears at this time that the
development of an efficient algorithm for the solution of this problem
is contingent upon the discovery of methods for resolving degeneracy
in linear programming. As degeneracy is a frequent nuisance in most
linear programming procedures, this problem is the subject of much
ongoing research.
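To make the vertex enumeration problem concrete, the following Python sketch
lists all optimal extreme points of a small linear program by brute force. It
is restricted to two variables and a bounded, non-empty feasible set
$\{u : G u \le h\}$ with integer data, and it intersects every pair of
constraint lines. It is an illustration of the task only, not one of the
algorithms from the literature referred to above, and the names `vertices_2d`
and `optimal_vertices` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def vertices_2d(G, h):
    """All vertices of {u in R^2 : G u <= h}, found by intersecting pairs of constraint lines."""
    found = set()
    for (g1, h1), (g2, h2) in combinations(list(zip(G, h)), 2):
        det = g1[0] * g2[1] - g1[1] * g2[0]
        if det == 0:
            continue  # parallel constraint lines do not define a vertex
        # Cramer's rule for the 2x2 system g1.u = h1, g2.u = h2.
        u = (
            Fraction(h1 * g2[1] - g1[1] * h2, det),
            Fraction(g1[0] * h2 - h1 * g2[0], det),
        )
        if all(row[0] * u[0] + row[1] * u[1] <= bound for row, bound in zip(G, h)):
            found.add(u)  # a set, so degenerate vertices are recorded once
    return found


def optimal_vertices(G, h, cost):
    """Every vertex attaining the minimum of cost.u over the feasible set."""
    verts = vertices_2d(G, h)
    value = lambda v: cost[0] * v[0] + cost[1] * v[1]
    best = min(value(v) for v in verts)
    return sorted(v for v in verts if value(v) == best)


# Unit square 0 <= u1, u2 <= 1; minimizing u1 leaves a whole edge optimal.
G = [(1, 0), (0, 1), (-1, 0), (0, -1)]
h = [1, 1, 0, 0]
optimal = optimal_vertices(G, h, (1, 0))
```

The number of constraint pairs grows combinatorially with the problem size, and
a degenerate vertex is produced by several pairs at once, which gives some
feeling for why efficient general methods are hard to obtain.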




APPENDIX B - CONSTRUCTING NON-BREAK FEEDBACK CONTROL REGIONS


Exactly one non-break feedback control region $R_{-\infty}$ is constructed in
either of the cases $q = 0$ or $q \ge 1$. If $q = 0$ then the state variables
in $\mathcal{L}_p$ leave $R_p(\mathcal{L}_p)$ backward in time with optimal
control set $\Omega_{-\infty}$ and $R_{-\infty}$ is constructed adjacent to
$R_p(\mathcal{L}_p)$. Similarly, if $q \ge 1$ then the state variables in
$\mathcal{L}_p$ leave the breakwall $w_{p-q}$ backward in time with optimal
control set $\Omega_{-\infty}$ and $R_{-\infty}$ is constructed adjacent to
$w_{p-q}$. In this discussion it is unnecessary to distinguish between these
cases; we therefore let $R_p$ represent either the subregion
$R_p(\mathcal{L}_p)$ or the breakwall $w_{p-q}$ depending upon whether
$q = 0$ or $q \ge 1$ respectively.


Theorem B.1 Suppose $\Omega_{-\infty}$ is the set of optimal controls with
which the state variables $\mathcal{L}_p$ leave $R_p$ backward in time. Then

$$R_{-\infty} = \mathrm{Co}(R_p \cup V_{-\infty}) / R_p$$

is the non-break feedback control region with associated control set
$\Omega_{-\infty}$ in the sense of Definition 4.


Proof. We must show that items (i)-(iii) of Definition 4 apply to
$R_{-\infty}$ and $\Omega_{-\infty}$. The situation is depicted in Figure B.1.

We prove item (iii) first. Consider $x \in R_p$. Translate each ray
in $V_{-\infty}$ by placing its origin at $x$ and call the translated set
$V'_{-\infty} = \{v'_1, v'_2, \ldots, v'_w\}$. Next form the conical region
$\mathcal{A}(x) = \mathrm{Co}(x \cup V'_{-\infty}) / x$. See Figure B.1. If
$x_1 \in \mathcal{A}(x)$, then there exists a direction which is some convex
combination of the members of $V'_{-\infty}$ which takes $x_1$ to $x$.
Hence, for any $x_1 \in \mathcal{A}(x)$ there exists a
$u \in \Omega_{-\infty}$ which takes $x_1$ to $x$. Now,
$R_{-\infty} = \mathrm{Co}(\{\mathcal{A}(x) \mid x \in R_p\})$ since the
smallest convex set containing $\{\mathcal{A}(x) \mid x \in R_p\}$ is clearly
$R_{-\infty}$. Therefore, for any $x_1 \in R_{-\infty}$ there exists some
direction which is a convex combination of members of $V_{-\infty}$ which
carries $x_1$ to some point $x \in R_p$. This is equivalent to saying
that for any $x_1 \in R_{-\infty}$ there exists a $u \in \Omega_{-\infty}$
such that $\dot{x} = B u + a$ carries $x_1$ to some point $x \in R_p$. Also,
the trajectory remains within $R_{-\infty}$ until it strikes $R_p$.

Now, let us select some $x_1 \in R_{-\infty}$ and apply any control
$u_1 \in \Omega_{-\infty}$ which keeps the state within $R_{-\infty}$ for a
non-zero period of time $\Delta t$. Clearly there exists such a control by the
above argument. Denote by $x_2$ the state which results after applying $u_1$
for the time $\Delta t$. Then also by the above argument there exists some
control $u_2 \in \Omega_{-\infty}$ which takes $x_2$ to some point
$x_3 \in R_p$. See Figure B.1. The control $u_2$ is optimal since
$\Omega_{-\infty}$ is constructed such that any $u_2 \in \Omega_{-\infty}$ is
optimal to move the state off of $R_p$ backward in time. Finally, $u_1$ is
optimal since $u_1 \in \Omega_{-\infty}$ and the trajectory segment
$x_2 \to x_1$ is part of the trajectory $x_3 \to x_2 \to x_1$ which leaves
from $R_p$. We have therefore shown that item (iii) of Definition 4 is
satisfied.


Items (i) and (ii) follow easily from the fact that $R_p$ is itself
part of a feedback control region.


□ Theorem B.1